remontoire-pac / ice-cancer-cell-lines

interoperable, integrated CTD^2 cancer cell-line computational environment exemplar
MIT License

Investigate downloading files directly in code rather than readme.md #4

Open remontoire-pac opened 4 years ago

remontoire-pac commented 4 years ago

To minimize work during a build, we should investigate how easy it is to fetch files directly from their public locations within the code, rather than relying on the links in the readme.md file. There are at least four use cases:

NCI FTP site: https://github.com/remontoire-pac/ice-cancer-cell-lines/blob/6616d673fedc6df2c0b35b38ac3c03bf6afbb117/code/build/m/onboard001CompoundSensitivity.m#L17-L18

Another GitHub repo: https://github.com/remontoire-pac/ice-cancer-cell-lines/blob/6616d673fedc6df2c0b35b38ac3c03bf6afbb117/code/build/m/onboard001CompoundSensitivity.m#L40-L41

DepMap portal: https://github.com/remontoire-pac/ice-cancer-cell-lines/blob/6616d673fedc6df2c0b35b38ac3c03bf6afbb117/code/build/m/onboard001CompoundSensitivity.m#L75-L76

Figshare link: https://github.com/remontoire-pac/ice-cancer-cell-lines/blob/6616d673fedc6df2c0b35b38ac3c03bf6afbb117/code/build/m/onboard001CompoundSensitivity.m#L100-L101

sandrine-m commented 4 years ago

As expected, the readtable function does not have an option to load from a URL. I looked at this earlier and have a question. I successfully loaded the data into MATLAB directly from the URL, without saving the source files locally. This strategy would require quite a lot of changes to the existing code (although I rather like that we would not have to save the sources). I am currently looking at saving the file first.
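
For reference, the "save the file first" route could look roughly like the sketch below. It is only an illustration: the function name `readSourceTable` and its arguments are hypothetical and not part of the repository. It assumes the source is reachable over HTTP(S) and that `readtable` can parse the saved file with default options. The direct route would instead call `webread` on the URL, whose output type depends on the content type the server reports.

```matlab
function T = readSourceTable(sourceUrl, localFile)
%READSOURCETABLE Save a remote file locally if needed, then read it.
%   Hypothetical helper for illustration; not part of the repository.
%   sourceUrl : HTTP(S) address of the source file
%   localFile : path where the local copy should live

    if ~isfile(localFile)
        % Download only when no local copy exists yet.
        opts = weboptions('Timeout', 60);   % seconds; an assumed value
        websave(localFile, sourceUrl, opts);
    end

    % The existing readtable-based code can stay as it is from here on.
    T = readtable(localFile);
end
```

With this shape, an onboarding script would only swap its hard-coded local path for one call to the helper, and a rebuild would not download again while the local copy is present.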

remontoire-pac commented 4 years ago

A fine solution for now would be to download and save all of the files in the appropriate location using a setup script that is called early (perhaps in build.m). A better solution would be for each onboarding script to be responsible for only the files that it needs, maintaining their independence.
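
One way the per-script variant might be sketched: each onboarding script declares the files it needs and passes that list to a small shared helper. Everything named here (`ensureSourceFiles`, the `url`/`name` struct layout, the example URL and folder) is an assumption for illustration, not existing code in the repository.

```matlab
function paths = ensureSourceFiles(sources, dataDir)
%ENSURESOURCEFILES Download any listed source files that are missing.
%   Hypothetical helper for illustration; not part of the repository.
%   sources : struct array with fields 'url' and 'name' (assumed layout)
%   dataDir : folder where the files should be stored
%   paths   : cell array of local paths, one per entry in sources
%
%   Example call from an onboarding script (hypothetical URL and name):
%     src = struct('url', 'https://example.org/a.csv', 'name', 'a.csv');
%     p = ensureSourceFiles(src, fullfile('data', 'raw'));

    if ~isfolder(dataDir)
        mkdir(dataDir);
    end

    opts = weboptions('Timeout', 120);   % seconds; an assumed value
    paths = cell(numel(sources), 1);
    for k = 1:numel(sources)
        target = fullfile(dataDir, sources(k).name);
        if ~isfile(target)
            % websave targets HTTP(S) URLs; the FTP case may need
            % MATLAB's ftp and mget functions instead.
            websave(target, sources(k).url, opts);
        end
        paths{k} = target;
    end
end
```

The same helper would also cover the interim option: a setup script called early from build.m could pass the combined list of all sources in one call.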