EarthyScience / REddyProc

Processing data from micrometeorological Eddy-Covariance systems
58 stars 33 forks source link

Provide consistent results with usEstUstarThreshold and sEddyProc_sGetEstimatedUstarThresholdDistribution and #12

Closed bgctw closed 6 years ago

bgctw commented 6 years ago

The output of sEddyProc_sEstUstarThresholdDistribution is a data.frame while the output of sEstUstarThreshold is a list. It makes it confusing now.

Actually, I would expect that if any of these two could be more complex and thus list would be needed though it is not. Maybe you could include in both methods argument fullOutput = FALSE producing just the expected data frame and list when set to TRUE. This way experienced users could explore more stats - if produced for $sEstUstarThresholdDistribution() - that are currently not available.

bgctw commented 6 years ago

The rational is/was to allow the user to inspect single runs more closely than the output of the entire bootstrap. But I agree, that a more consistent result would be preferable.

I argue against providing the detailed list output with sEddyProc_sEstUstarThresholdDistribution because it differs by each bootstrap sample.

I also argue against changing the output format depending on some input argument flags. For me this is even more confusing.

Rather, I suggest providing the reduced data.frame output only (without the quantile columns) also for sEstUstarThreshold, and leave the full output list only inside the function for people actually debugging the code.

lsigut commented 6 years ago

I would agree with the simplification. Thanks! I never used the other elements of the sEstUstarThreshold output than uStarTh.

bgctw commented 6 years ago

When changing the output format of an exported function, this may break existing code of users.

Therefore I suggest

Additionally, we could deprecate the existing function.

The inconsistency makes apparent a further problem of more than a single task of a function:

Currently, the tasks of setting the uStar seasons is a part of usEstUstarThreshold. Seasons are subsequently used in gapfilling for applying different thresholds at different times/seasons. However, they will not be returned by the function any more. I suggest an additional function sEddyProc_setUStarSeasons to set the uStar seasons before calling usEstUstarThresholdSingle instead of providing them as an argument and check in usEstUstarThresholdSingle that the seasons have been set before.

lsigut commented 6 years ago

I have the impression that the naming strategy is too verbose. Adding appendix Single makes it even longer. Though the reason seems obvious now, for future users it will not be once you remove the original ones. I would suggest instead something like sEddyProc_sEstUT and usEstUT. Deprecation sounds good.. I am not aware of all the things happening internally so it might be much more complex than I imagine. In case there are some internal dependencies on the current output structure, maybe you can first try to get rid of those and later come with the new method. But if one more function can solve it, that sounds reasonable. This will be run independently or could be as an argument to sEddyProc_sEstUT?

bgctw commented 6 years ago

UT is not expressive enough to my opinion. The former method name was good. How about replacing by usEstUStarThold?

lsigut commented 6 years ago

Could be fine too. My point was just to make it possibly shorter not longer. I prefer to put names that are possible to memorize to build simple vocabulary for the users.