Closed inodb closed 4 years ago
Hi Ino @inodb Thanks for the PR! I like this idea.
I would start the function right after cancer_file (the result of downloadStudy).
Please rename it to something like importStudy or loadStudy.
You'd have to make sure that this function is documented and exported for users that ran the downloadStudy step.
I can help with this.
Also, make sure you have merged the latest changes from the master branch.
--Marcel
@LiNk-NY Thanks for reviewing! I tried to address your comments.
For this comment:
I would start the function right after cancer_file (the result of downloadStudy).
I didn't fully understand what you meant by that. For the use case of the datahub action, I needed a function that works directly on the untarred files. I created a separate function, untarStudy, that does the untarring bit. Lmk if that addresses the comment or if you had something else in mind.
Regarding the exporting/documenting, I'm not 100% sure how to do it. Can you point me to an example of approximately what I should add to get the function to export and document properly? Thanks so much!
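For reference, exporting and documenting usually looks something like the sketch below. It assumes the package generates its NAMESPACE and man pages with roxygen2, which this thread does not confirm. The name untarStudySketch, its arguments, and its body are illustrative only and are not the code in this PR.

```r
#' Extract a downloaded study archive (illustrative sketch)
#'
#' @param cancer_study_file character(1) Path to a study tarball, for
#'   example the file produced by a download step.
#' @param exdir character(1) Directory to extract the archive into.
#'
#' @return The path to the directory that holds the extracted files.
#'
#' @export
untarStudySketch <- function(cancer_study_file, exdir = tempdir()) {
    # Fail early if the input is not a single existing file path
    stopifnot(
        is.character(cancer_study_file),
        length(cancer_study_file) == 1L,
        file.exists(cancer_study_file)
    )
    # Base R extraction; no external packages needed
    utils::untar(cancer_study_file, exdir = exdir)
    exdir
}
```

With comments in this form, running roxygen2::roxygenise() (or devtools::document()) in the package directory writes the help page and adds the export() entry to NAMESPACE.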
Hi Ino, @inodb
untarStudy looks good. Thanks for doing that. Keeping these steps modular is good.
I'll take care of the documentation and other minor details. Thanks!
@LiNk-NY Perfect! Thank you!
This will allow calling just the parsing function, extractedStudyFolder2MultiAssayExperiment(filepath), in addition to the already existing function that downloads and parses in one step (cBioDataPack). This is part of the effort to run testing as a github action on the datahub repo directly. On datahub we won't need to download the study; we only need to see if we can parse it. Feel free to rename/organize as preferred.
This is an example of the corresponding github action on the datahub repo: https://github.com/cBioPortal/datahub/pull/1222. Right now it allows you to semi-manually test a study using the github actions interface. In the future we can maybe add an action to run on each PR to datahub where we only test studies that are changed in the diff.