Closed denbonte closed 1 year ago
@denbonte
The linear model scores should be available already using this download: https://aim-harvard.github.io/foundation-cancer-image-biomarker/user-guide/analysis/#predictions-pipeline
The foundation_features.csv file under the HarvardRT folder should contain the scores on the validation set, which are needed to get the median split threshold.
As for the features, what do you think? Should we share them? The benefit is we can then also show the linear model training pipeline.
The foundation_features.csv file under the HarvardRT folder should contain the scores on the validation set, which are needed to get the median split threshold.
Perfect!
As for the features, what do you think? Should we share them? The benefit is we can then also show the linear model training pipeline.
To show the training pipeline, we would probably have to share a minimal amount of information from the HarvardRT cohort as well (e.g., survival data) - and I'm pretty sure the cohort is not cleared for that. We can discuss, but for the time being I don't see this as crucial!
Added everything in the last couple of commits (https://github.com/ImagingDataCommons/idc-radiomics-reproducibility/commit/434b547cf2d51df2436ae47fed5f94eea263a49e and https://github.com/ImagingDataCommons/idc-radiomics-reproducibility/commit/0d2a7650b7ff39618c6ff003f41d964e795487b8).
Works perfectly. I'm going to push the notebook to the repo in a few minutes.
🚀 Feature Request
Add tabular data for the HarvardRT dataset. Of course, we don't need to add PHI (e.g., the clinical data), just the features (optional, not mandatory) and, most importantly, the scores of the linear model, so that whoever wants to replicate the KM analysis or use that model further knows what the median split threshold is.
Once again, I'm happy to have them in the idc-radiomics-reproducibility repo instead of here.
🔈 Motivation
This would help transparency and reproducibility, and would make the predictive model usable directly by others.
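For anyone replicating the KM analysis, here is a minimal sketch of the median split discussed in this thread: take the median of the linear model scores on the validation set as the threshold, then use it to assign cases to a high-score or low-score group. The column name `score`, the toy values, and the convention that scores strictly above the threshold count as "high" are all assumptions for illustration. They are not taken from the actual foundation_features.csv, whose layout should be checked first.

```python
import csv
import io
import statistics


def median_threshold(validation_scores):
    """Return the median of the validation-set scores (the split threshold)."""
    return statistics.median(validation_scores)


def split_by_threshold(scores, threshold):
    """Label each score 'high' if strictly above the threshold, else 'low'.

    The strict '>' is an assumed convention; the original analysis may
    treat ties differently.
    """
    return ["high" if s > threshold else "low" for s in scores]


# Toy stand-in for a CSV of validation scores; the 'score' column name
# and the values are hypothetical.
toy_validation_csv = io.StringIO(
    "case_id,score\nA,0.10\nB,0.40\nC,0.70\nD,0.90\n"
)
validation_scores = [
    float(row["score"]) for row in csv.DictReader(toy_validation_csv)
]

threshold = median_threshold(validation_scores)

# Apply the validation-derived threshold to scores from another cohort.
new_scores = [0.20, 0.60, 0.95]
groups = split_by_threshold(new_scores, threshold)
print(threshold, groups)
```

The resulting group labels can then be passed, together with survival times and event indicators, to whichever Kaplan-Meier implementation is used for the analysis.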