Open giacomoguiduzzi opened 2 weeks ago
This issue had no activity for 14 days. It will be closed in 1 week unless there is some new activity. Is this issue already resolved?
Dear Giacomo Guiduzzi,
Thank you for reaching out and sharing your observations about sequence-missing and block-missing behaviour in PyGrinder. The behaviour you’ve described could be due to an interaction between the existing missing data in your dataset and the additional missingness introduced.
If your dataset already contains missing values, the newly introduced missing values can overlap with the original ones, because a position that is already missing cannot become missing a second time. As a result, the observed increase in the missing rate can be lower than the specified value. The effect is most noticeable when few sequences or blocks in the data are completely observed to begin with.
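To make the overlap effect concrete, here is a minimal sketch in plain NumPy. It does not call PyGrinder at all: the function name `overlap_rates`, the array shape, and the element-wise random masking are assumptions made purely for illustration, and the actual sequence-missing and block-missing routines may select positions differently.

```python
import numpy as np


def overlap_rates(p, existing_rate, shape=(1000, 24, 5), seed=0):
    """Hypothetical helper: mask values with probability p on data that
    already has missing values, then report two missing rates."""
    rng = np.random.default_rng(seed)

    # Synthetic data with some values already missing (NaN).
    X = rng.normal(size=shape)
    X[rng.random(shape) < existing_rate] = np.nan
    originally_missing = np.isnan(X)

    # Stand-in for an artificial-missingness step: every position is
    # selected with probability p, whether or not it is already NaN.
    selected = rng.random(shape) < p
    X_masked = X.copy()
    X_masked[selected] = np.nan

    # Share of all values that were observed before and are missing now.
    added_rate = (selected & ~originally_missing).mean()
    # Share of all values that are missing after masking.
    total_rate = np.isnan(X_masked).mean()
    return added_rate, total_rate


added, total = overlap_rates(p=0.5, existing_rate=0.2)
print(f"newly removed share: {added:.3f}, total missing share: {total:.3f}")
```

Under these assumptions the newly removed share is roughly `p * (1 - existing_rate)` rather than `p`, since selected positions that were already missing add nothing. Comparing `np.isnan(X).mean()` before and after the masking step on your own data should show whether the same thing is happening in your case.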
Please let me know if this explanation aligns with your situation, or feel free to provide more details about your dataset or experimental setup, and I’d be happy to assist further.
Best regards,
linglong
Issue description
Greetings,
I'm working on a project related to forecasting time series with Deep Learning methods. A quick question about sequence missing and block missing from PyGrinder: I noticed that when I set a replace_pct value of 0.5 I am not actually getting around 50% of missing values, but 39%. If I raise this value to 0.75 then I get around 50%. Is this normal? Am I missing something? Let me know if there is any additional information I can give you regarding this behaviour. Thanks in advance, I'm looking forward to your kind response.
Best Regards, Giacomo Guiduzzi