gitter-lab / LPWC

Lag Penalized Weighted Correlation for Time Series Clustering
https://doi.org/10.1186/s12859-019-3324-1

LPWC simulated data #62

Open antoine4ucsd opened 1 year ago

antoine4ucsd commented 1 year ago

Thank you for this interesting approach and for sharing the code. I am really interested in applying a similar approach to our dataset. I have a couple of questions:

  1. Do you mind sharing the code used to generate the simulated data? I really like it and I'd like to generate my own with different trajectories.
  2. Can LPWC handle missing time points for some samples/genes/variables, or would you recommend splines to impute the missing data points? Thank you!

agitter commented 1 year ago

Thanks for your interest @antoine4ucsd.

  1. The simulations were derived from the impulse model from ImpulseDE. The paper doesn't describe the model in detail but you can see it in the earlier paper or original code: Bioconductor or GitHub. I'm attaching what should be all the relevant files from the private manuscript repository we used to generate figures (from commit 6b8774f1c6b83e7123aaac99f9e20f457f38d6b5). GitHub won't let me attach .R files so rename .txt to .R:

  2. I believe LPWC expects there to be complete data without any missing values. However, I'm not seeing any error checking where we enforce that expectation. In our case studies in the paper we removed rows with missing values because they were a proof of concept. If you want to do a real analysis, that may not be an option. Imputation with a temporally-aware approach (like splines or perhaps even fitting the impulse models above if they fit your data well) would be a reasonable choice.
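
As a rough illustration of point 1, the impulse model is commonly written as a product of two sigmoids scaled by the peak height. The Python sketch below follows that form. It is not the code from the manuscript repository, and the function names (`impulse`, `simulate_profile`), parameter values, time points, and noise level are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(x):
    # Logistic function written with tanh so large |x| cannot overflow.
    return 0.5 * (1.0 + math.tanh(0.5 * x))


def impulse(t, h0, h1, h2, t1, t2, beta):
    """Impulse model value at time t.

    h0: initial level, h1: peak level, h2: final level,
    t1: onset time, t2: offset time, beta: slope of both transitions.
    """
    rise = h0 + (h1 - h0) * sigmoid(beta * (t - t1))
    fall = h2 + (h1 - h2) * sigmoid(-beta * (t - t2))
    return rise * fall / h1


def simulate_profile(timepoints, params, noise_sd, rng):
    # One noisy temporal profile: impulse curve plus Gaussian noise.
    return [impulse(t, **params) + rng.gauss(0.0, noise_sd) for t in timepoints]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical, irregularly spaced time points and parameters.
timepoints = [0, 2, 4, 6, 8, 10, 15, 20, 30, 45, 60]
params = dict(h0=1.0, h1=5.0, h2=2.0, t1=5.0, t2=25.0, beta=1.0)

early = simulate_profile(timepoints, params, noise_sd=0.1, rng=rng)
# Same shape with later onset and offset, to mimic a lagged pattern.
late = simulate_profile(timepoints, dict(params, t1=10.0, t2=30.0),
                        noise_sd=0.1, rng=rng)
```

Changing `h0`, `h1`, `h2`, `t1`, `t2`, and `beta` gives different trajectories, and shifting `t1` and `t2` together produces lagged copies of the same shape.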

@thevaachandereng we should document the expectations regarding missing values before closing this issue.
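
Until that is documented, one way to prepare data is to check each row for missing values and fill the gaps before clustering. The sketch below is a minimal, assumption-laden example in plain Python: it uses linear interpolation over the actual time points as a simpler stand-in for the spline imputation suggested above, it assumes the time points are sorted in increasing order, and the helper names `has_missing` and `interpolate_row` are hypothetical and not part of LPWC.

```python
import math


def is_observed(v):
    # A value counts as observed if it is neither None nor NaN.
    return v is not None and not math.isnan(v)


def has_missing(row):
    # True if any entry in the row is None or NaN.
    return any(not is_observed(v) for v in row)


def interpolate_row(timepoints, row):
    """Fill missing values by linear interpolation over time.

    Gaps before the first or after the last observed value are filled
    with the nearest observed value. Assumes timepoints is sorted.
    """
    known = [(t, v) for t, v in zip(timepoints, row) if is_observed(v)]
    if not known:
        raise ValueError("row has no observed values")
    filled = []
    for t, v in zip(timepoints, row):
        if is_observed(v):
            filled.append(v)
            continue
        before = [(tk, vk) for tk, vk in known if tk < t]
        after = [(tk, vk) for tk, vk in known if tk > t]
        if before and after:
            (t0, v0), (t1, v1) = before[-1], after[0]
            # Weight by elapsed time, so uneven spacing is respected.
            filled.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
        else:
            filled.append(before[-1][1] if before else after[0][1])
    return filled


# Hypothetical example: four unevenly spaced time points, one gap.
timepoints = [0, 5, 15, 30]
row = [1.0, float("nan"), 3.0, 6.0]
if has_missing(row):
    row = interpolate_row(timepoints, row)
```

Because the interpolation weights by elapsed time rather than by index, it stays sensible when sampling is irregular. A spline or a fitted impulse model could replace the linear step if the data are smooth enough to support it.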

antoine4ucsd commented 1 year ago

Thank you so much for sharing this code and for your detailed response. I will work on it today and keep you updated. Best,