WorldCereal / presto-worldcereal


`available_timesteps` might be wrongly computed (and needs some documentation) #111

Open kvantricht opened 1 week ago

kvantricht commented 1 week ago

How `available_timesteps` is determined is undocumented, so it is not clear to a user what the value means. For example, I'm investigating a sample where `available_timesteps == 11`, which is odd given that the required number of timesteps is 12. Some documentation would help clarify this.

https://github.com/WorldCereal/presto-worldcereal/blob/c1eadb50a102f99d9c7de152cd3eecfee30a2c9b/presto/utils.py#L325

kvantricht commented 1 week ago

In fact, the sample I'm looking at (`2019Usda1053623_1753`) has `start_date = 2018-12-01` and `end_date = 2019-11-30`. This is exactly 12 months, but the computation of `available_timesteps`:

    df_pivot["available_timesteps"] = (
        df_pivot["end_date"].dt.year * 12 + df_pivot["end_date"].dt.month
    ) - (df_pivot["start_date"].dt.year * 12 + df_pivot["start_date"].dt.month)

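results in 11 timesteps: (2019 * 12 + 11) - (2018 * 12 + 12) = 24239 - 24228 = 11. The difference counts the month boundaries crossed, so the end month itself is not included. I guess this formula needs a +1 to be correct?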
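To make the off-by-one concrete, here is a minimal sketch using plain `datetime.date` values instead of the `df_pivot` columns. The function names `months_exclusive` and `months_inclusive` are hypothetical and do not exist in `presto/utils.py`; the sketch also assumes that both the start month and the end month should count as a timestep, which is what the 12-month requirement suggests but is not confirmed here.

```python
from datetime import date


def months_exclusive(start: date, end: date) -> int:
    """Month difference as in the current formula (end month not counted)."""
    return (end.year * 12 + end.month) - (start.year * 12 + start.month)


def months_inclusive(start: date, end: date) -> int:
    """Month count when both the start and the end month are included.

    Assumption: one timestep per calendar month, inclusive on both ends.
    """
    return months_exclusive(start, end) + 1


# The sample from this issue: 2018-12-01 to 2019-11-30.
start, end = date(2018, 12, 1), date(2019, 11, 30)
print(months_exclusive(start, end))  # 11
print(months_inclusive(start, end))  # 12
```

If that assumption holds, the same `+ 1` could be appended to the existing pandas expression without changing anything else.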

Converting this issue to a bug as I think something is not working correctly here (might be wrong though).