AlexsLemonade / OpenScPCA-analysis

An open, collaborative project to analyze data from the Single-cell Pediatric Cancer Atlas (ScPCA) Portal

Improve Wilms Tumor Dataset Annotation (SCPCP000006) - explore `predicted.score` and `has_cnv.score` thresholds #856

Open maud-p opened 14 hours ago

maud-p commented 14 hours ago

If you are filing this issue based on a specific GitHub Discussion, please link to the relevant Discussion.

This issue follows the PR https://github.com/AlexsLemonade/OpenScPCA-analysis/pull/844 and the 2 comments:

Describe the goals of the changes to the analysis module.

I would like to explore different thresholds for filtering and annotating based on the `predicted.score` and `cnv.score`. I would like to:

What will your pull request contain?

A few changes in the 07 notebook
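
Roughly, the exploration could look like the sketch below. It is plain Python with made-up toy values, not the module's code: the field names `predicted_score` and `cnv_score`, the `>=` comparison, the way the two scores are combined, and the cutoff grid are all assumptions for illustration only.

```python
from itertools import product


def sweep_thresholds(cells, predicted_cutoffs, cnv_cutoffs):
    """Count cells passing each (predicted score, CNV score) cutoff pair.

    Assumption: a cell "passes" when both scores are >= their cutoffs.
    The real notebook may combine the scores differently.
    """
    counts = {}
    for p_cut, c_cut in product(predicted_cutoffs, cnv_cutoffs):
        counts[(p_cut, c_cut)] = sum(
            1
            for cell in cells
            if cell["predicted_score"] >= p_cut and cell["cnv_score"] >= c_cut
        )
    return counts


# Toy, invented values purely for illustration; not ScPCA data.
toy_cells = [
    {"predicted_score": 0.95, "cnv_score": 0.90},
    {"predicted_score": 0.80, "cnv_score": 0.70},
    {"predicted_score": 0.60, "cnv_score": 0.95},
    {"predicted_score": 0.90, "cnv_score": 0.40},
]

result = sweep_thresholds(toy_cells, [0.5, 0.85], [0.5, 0.85])
for (p_cut, c_cut), n_cells in sorted(result.items()):
    print(f"predicted >= {p_cut}, cnv >= {c_cut}: {n_cells} cells kept")
```

Tabulating how many cells survive each pair of cutoffs should make it easier to see how sensitive the annotation is to either threshold before settling on one.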

Will you require additional software beyond what is already in the analysis module?

No response

Will you require different computational resources beyond what the analysis module already uses?

No response

If known, when do you expect to file the pull request?

~ November

sjspielman commented 14 hours ago

Hi @maud-p, glad to see you back here in issues! I wanted to give you a heads up about continuing this module - I am still working behind the scenes on your module to get it all running in CI. I have updated the label transfer code but it's not yet merged into main (but will be within the next 2 weeks I think 🤞), since I am still working in a separate branch to fix some bugs we are now able to find with all code running in CI. You can see code as we work on it in this branch: https://github.com/AlexsLemonade/OpenScPCA-analysis/tree/feature/wilms-tumor-06-azimuth. While I am still working in my fork, rather than sending PRs to main, I am sending them here. Once this is entirely finished, we'll merge that branch into main.

FYI - one silly (!!!) bug I found is that somehow we never actually applied the score threshold in inferCNV - woops!! So as part of this, I am making sure we use the threshold in that script too!

I think that working on the module while I am still doing this will result in _a lot_ of conflicts, which will be very challenging to resolve. Also, the results will slightly change because of the new label transfer code, and the actual use of the 0.85 threshold in inferCNV, which will also complicate interpretation and validation. Are you able to wait a few weeks before doing these additional analyses? I will certainly keep you updated as I continue this process!

maud-p commented 14 hours ago

@sjspielman thank you for all your efforts in making the analysis run in CI! I understand and I can wait, no problem at all! No rush from my side. I just opened the issue to inform you about the plans and coordinate the next steps with you. Just let me know if/how I can help and when I can start working on the analysis again 😃 Thank you!