ZijieZhaoMMHW / m_mhw1.0

A MATLAB toolbox to detect and analyze marine heatwaves (MHWs).
GNU General Public License v3.0
45 stars 19 forks source link

MATLAB warning flags #5

Closed mvdh7 closed 5 years ago

mvdh7 commented 5 years ago

For JOSS review

The MATLAB code for the functions themselves contains many warning messages indicating inefficient or unnecessary constructs. I strongly suggest the authors work through and eliminate these, to improve code legibility, and in some cases, its performance.

ZijieZhaoMMHW commented 5 years ago

Most warning flag is due to the fact that many variables' size would change during loops, so MATLAB automatically suggest parallel computation. But parallel computation is actually not suitable for this case. As what mentioned in manuscript, parallel computation is not suitable here as the size of the resultant MHW/MCS dataset would change with loops, which is against the rule that each loop should be independent of others in a parallel computation. It should be noted that parallel computation could be used if each detected event is stored independently. However, this would mean that another loop should be included to all events into one composite. We have tested this approach and found that it would not be faster than the version using simple loops

mvdh7 commented 5 years ago

My understanding of the JOSS review process is that it's supposed to help improve code not just check that it works. The comments below are given in that spirit, but I am quite happy for them to be ignored if I have misunderstood, and if the Editor feels these are not appropriate or necessary for publication.

Eliminating the warning flags would be a useful exercise especially if combined with improvements to readability and style throughout. The code is not easy to follow for several reasons: (1) there is a lack of comments within the functions describing what is going on; (2) some basic good practices are ignored, such as spaces around = characters; (3) there are some very long lines that should be broken up with ....

  1. The majority of your warning flags are due to unnecessary semi-colons, usually at the end of a for/if/while statement. While these don't cause any problems, they do make it much more difficult to spot other, more useful warnings. They require very little effort to remove.

  2. You then have some unnecessary find statements, of the format:

vector_variable(find(logical_expression))

These work equally well - and marginally faster - if you simply put:

vector_variable(logical_expression)

  1. You have some variables that are assigned within loops that are never used, which can simply be eliminated to avoid wasting memory.

  2. Based on your reply above, I think you may be mixing up "parallel" with "preallocation". MATLAB does not suggest parallel computation in your code, it suggests that you preallocate some variables in your loops. This means that if you are assigning values into a vector one by one in a loop, you should ideally create an empty vector before the loop begins, of the correct size, into which the results are added.

For example, part of your code in detect() looks something like this:

mhwint_max = [];
for le = 1:size(potential_event,1)
    maxanom = <some code>;
    mhwint_max = [mhwint_max; maxanom];
end

To preallocate you would change this to:

mhwint_max = NaN(size(potential_event,1),1);
for le = 1:size(potential_event,1)
    maxanom = <some code>;
    mhwint_max(le) = maxanom;
end

This allows your computer to use memory more efficiently and your code to run faster.

  1. One warning flag simply tells you it is faster to use str2double instead of str2num. This is an extremely straightforward switch. Why resist?
ZijieZhaoMMHW commented 5 years ago

Thanks a lot for careful and thoughtful review. I have edited the script for your suggestion. Thanks a lot. Now most warning flag has been removed.

I need to mention that find still needs to be used in potential_event(find(gaps<=vmaxGap),2)=potential_event(find(gaps<=vmaxGap)+1,2); due to the fact that gaps has different size from potential_event, here I just use find to determine the location that I want to fill in so I have to use find, but MATLAB initially think gaps has the same size with potential_event so it suggests no find. Please see https://github.com/ZijieZhaoMMHW/m_mhw1.0/blob/master/detect.m.

ZijieZhaoMMHW commented 5 years ago

Hi @mvdh7. for your consideration, I send the toolbox to @codacy-badger and get the code quality A. Please see https://github.com/ZijieZhaoMMHW/m_mhw1.0

mvdh7 commented 5 years ago

Nice work. I hope this has helped improve the code.

Thank you for pointing out the code quality tester - I hadn't come across that before. However, I don't think it supports MATLAB, so the badge it has given may not be very meaningful.