AIM-Harvard / foundation-cancer-image-biomarker

Code and evaluation repository for the paper
https://aim-harvard.github.io/foundation-cancer-image-biomarker/
MIT License

Add tabled data for HarvardRT #238

Closed: denbonte closed this issue 1 year ago

denbonte commented 1 year ago

🚀 Feature Request

Add tabular data for the HarvardRT dataset. We don't need to add PHI (e.g., the clinical data), just the features (optional) and, most importantly, the scores of the linear model, so that anyone who wants to replicate the KM analysis or use that model further knows what the median split threshold is.

Once again, I'm happy to have them in the idc-radiomics-reproducibility repo instead of here.

🔈 Motivation

This would help transparency and reproducibility, and would make the predictive model usable directly by others.

surajpaib commented 1 year ago

@denbonte

The linear model scores should be available already using this download: https://aim-harvard.github.io/foundation-cancer-image-biomarker/user-guide/analysis/#predictions-pipeline

The foundation_features.csv file under the HarvardRT folder should contain the scores on the validation set, which are what you need to get the median split threshold.
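For anyone following along, a minimal sketch of the median split could look like the following. It assumes the scores sit in a column named `score` (a hypothetical name; check the actual header in foundation_features.csv) and that a score exactly at the median goes to the low group, which is not specified in this thread. The demo rows are synthetic values, not HarvardRT data.

```python
import csv
import io
import statistics


def median_split_threshold(csv_file, score_column="score"):
    """Return the median of `score_column` read from an open CSV file.

    `score_column` is a hypothetical default; use the real column name.
    """
    reader = csv.DictReader(csv_file)
    scores = [float(row[score_column]) for row in reader]
    return statistics.median(scores)


def assign_risk_group(score, threshold):
    """Label a score 'high' if strictly above the threshold, else 'low'.

    Sending ties to 'low' is an assumption made for this sketch.
    """
    return "high" if score > threshold else "low"


# Synthetic stand-in for the validation-set scores (not real data).
demo_csv = io.StringIO(
    "case_id,score\na,0.25\nb,0.50\nc,0.75\nd,1.00\n"
)
threshold = median_split_threshold(demo_csv)

# Apply the validation-set threshold to two new, made-up scores.
groups = [assign_risk_group(s, threshold) for s in (0.2, 0.8)]
print(threshold, groups)
```

To run it on the downloaded file, pass an open file handle instead of the `io.StringIO` demo, e.g. `open(path, newline="")` with `path` pointing at your local copy.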

As for the features, what do you think? Should we share them? The benefit is we can then also show the linear model training pipeline.

denbonte commented 1 year ago

> The foundation_features.csv file under the HarvardRT folder should contain the scores on the validation set, which are what you need to get the median split threshold.

Perfect!

> As for the features, what do you think? Should we share them? The benefit is we can then also show the linear model training pipeline.

To show the training pipeline, we would probably have to share a minimal amount of information from the HarvardRT cohort as well (e.g., survival data) - and I'm pretty sure the cohort is not cleared for that. We can discuss, but for the time being I don't see this as crucial!

denbonte commented 1 year ago

Added everything in the last couple of commits (https://github.com/ImagingDataCommons/idc-radiomics-reproducibility/commit/434b547cf2d51df2436ae47fed5f94eea263a49e and https://github.com/ImagingDataCommons/idc-radiomics-reproducibility/commit/0d2a7650b7ff39618c6ff003f41d964e795487b8).

Works perfectly. I'm going to push the notebook to the repo in a few minutes.