broadinstitute / gdctools

Python and UNIX CLI utilities to simplify interaction with the NIH/NCI Genomics Data Commons

gdc_loadfile sometimes does nothing #48

Closed gsaksena closed 6 years ago

gsaksena commented 7 years ago

That was the case with the attached config file

gsaksena commented 7 years ago

methylation.cfg.txt

noblem commented 6 years ago

This issue is a few months old, so after the recent flurry of activity I checked whether it is still relevant. The good news is that it is not, and it can be closed. With today's code, when I use the attached file "as is" on cga3, the tool tells you what's wrong:

gdc_loadfile --config methylation.cfg
Required config variable is unset: loadfile.dir

That is "something," so the tool is decidedly not doing "nothing".

Moreover, that particular complaint arises because the loadfile section is misnamed, perhaps through cut/paste from a very old example. In your attached file the section is named [loadfiles], but it should be [loadfile], so the tool concluded that the section it was looking for was not defined at all. The name of each tool section in the config file should correspond to the tool it configures: e.g. [loadfile] for gdc_loadfile, [mirror] for gdc_mirror, etc.
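To catch this kind of mismatch before running a tool, a config file can be checked up front. The following is only a rough sketch using Python's standard configparser, not how gdctools itself validates configs: the function name check_tool_section, the rule of stripping the gdc_ prefix to get the section name, and the message wording are all assumptions made for illustration.

```python
import configparser
import difflib


def check_tool_section(config_text, tool, required=("dir",)):
    """Return a list of problems found for `tool`; an empty list means none.

    Hypothetical helper: assumes the section name is the tool name without
    its "gdc_" prefix (gdc_loadfile -> [loadfile]), per the convention above.
    """
    section = tool[len("gdc_"):] if tool.startswith("gdc_") else tool

    parser = configparser.ConfigParser()
    parser.read_string(config_text)  # accepts both "KEY: value" and "KEY = value"

    problems = []
    if not parser.has_section(section):
        message = "missing section [%s]" % section
        # Suggest a similarly named section, e.g. [loadfiles] for [loadfile].
        hint = difflib.get_close_matches(section, parser.sections(), n=1)
        if hint:
            message += " (found [%s]; possible misspelling?)" % hint[0]
        problems.append(message)

    # configparser lower-cases option names by default, so DIR matches "dir".
    for option in required:
        if not parser.has_option(section, option):
            problems.append("unset variable: %s.%s" % (section, option))
    return problems


# Illustrative config text with the misnamed section; paths are placeholders.
SAMPLE = """\
[DEFAULT]
ROOT_DIR: /tmp/example_root

[loadfiles]
DIR: /tmp/example_loadfiles
"""

if __name__ == "__main__":
    for problem in check_tool_section(SAMPLE, "gdc_loadfile"):
        print(problem)
```

Renaming [loadfiles] to [loadfile] in the sample text makes the function return an empty list.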

If I correct the section name to [loadfile] and also tweak the ROOT_DIR value to

[DEFAULT]
ROOT_DIR: /broad/hptmp/mnoble/loadfile_test

(so that I will have permissions to write to it), then rerun

gdc_loadfile --config meth.cfg.edit1

I get an appropriate error

ERROR:root:Create Loadfile FAILED:
... snip ...
ValueError: No datestamps found, use upstream tool first

because the ROOT_DIR does not contain dicing datestamps (it's empty!). So the tool is suggesting that you run an upstream tool (e.g. gdc_dice, and maybe even gdc_mirror) FIRST, before attempting to generate loadfiles. Another way to remove that snag is to point ROOT_DIR at a location where the tool CAN find the dicing it wants. So, if I change ROOT_DIR and LOADFILE_DIR as follows

[DEFAULT]
ROOT_DIR: /xchip/gdac_data/gdc
...
[loadfile]
DIR: /broad/hptmp/mnoble/loadfiles

loadfile generation commences as requested:


gdc_loadfile --config meth.cfg.edit2
2017-09-16 23:16:18,962[INFO]: Inspecting data for TCGA-BRCA with version datestamp 2017_09_16
2017-09-16 23:16:19,328[INFO]: Inspecting data for TCGA-COAD with version datestamp 2017_09_16
2017-09-16 23:16:19,485[INFO]: Inspecting data for TCGA-GBM with version datestamp 2017_09_16
2017-09-16 23:16:19,634[INFO]: Inspecting data for TCGA-TGCT with version datestamp 2017_09_16
2017-09-16 23:16:19,685[INFO]: Inspecting data for TCGA-UCS with version datestamp 2017_09_16
2017-09-16 23:16:19,707[INFO]: Generating loadfile for TCGA-BRCA
2017-09-16 23:16:19,731[INFO]: Writing cases loadfile to /broad/hptmp/mnoble/loadfiles/TCGA/2017_09_16/TCGA-BRCA.Participant.loadfile.txt
2017-09-16 23:16:19,734[INFO]: Writing samples loadfile to /broad/hptmp/mnoble/loadfiles/TCGA/2017_09_16/TCGA-BRCA.Sample.loadfile.txt
2017-09-16 23:16:19,736[INFO]: Writing filtered samples to /broad/hptmp/mnoble/loadfiles/TCGA/2017_09_16/TCGA-BRCA.filtered_samples.txt
2017-09-16 23:16:20,289[INFO]: Writing sample set loadfile to /broad/hptmp/mnoble/loadfiles/TCGA/2017_09_16/TCGA-BRCA.Sample_Set.loadfile.txt
2017-09-16 23:16:20,391[INFO]: Generating loadfile for TCGA-COAD
2017-09-16 23:16:20,392[INFO]: Writing cases loadfile to /broad/hptmp/mnoble/loadfiles/TCGA/2017_09_16/TCGA-COAD.Participant.loadfile.txt
2017-09-16 23:16:20,395[INFO]: Writing samples loadfile to /broad/hptmp/mnoble/loadfiles/TCGA/2017_09_16/TCGA-COAD.Sample.loadfile.txt
2017-09-16 23:16:20,398[INFO]: Writing filtered samples to /broad/hptmp/mnoble/loadfiles/TCGA/2017_09_16/TCGA-COAD.filtered_samples.txt
2017-09-16 23:16:20,632[INFO]: Writing sample set loadfile to /broad/hptmp/mnoble/loadfiles/TCGA/2017_09_16/TCGA-COAD.Sample_Set.loadfile.txt
2017-09-16 23:16:20,692[INFO]: Generating loadfile for TCGA-GBM
2017-09-16 23:16:20,692[INFO]: Writing cases loadfile to /broad/hptmp/mnoble/loadfiles/TCGA/2017_09_16/TCGA-GBM.Participant.loadfile.txt
2017-09-16 23:16:20,694[INFO]: Writing samples loadfile to /broad/hptmp/mnoble/loadfiles/TCGA/2017_09_16/TCGA-GBM.Sample.loadfile.txt
2017-09-16 23:16:20,697[INFO]: Writing filtered samples to /broad/hptmp/mnoble/loadfiles/TCGA/2017_09_16/TCGA-GBM.filtered_samples.txt
2017-09-16 23:16:20,899[INFO]: Writing sample set loadfile to /broad/hptmp/mnoble/loadfiles/TCGA/2017_09_16/TCGA-GBM.Sample_Set.loadfile.txt
2017-09-16 23:16:20,965[INFO]: Generating loadfile for TCGA-TGCT
2017-09-16 23:16:20,965[INFO]: Writing cases loadfile to /broad/hptmp/mnoble/loadfiles/TCGA/2017_09_16/TCGA-TGCT.Participant.loadfile.txt
2017-09-16 23:16:20,968[INFO]: Writing samples loadfile to /broad/hptmp/mnoble/loadfiles/TCGA/2017_09_16/TCGA-TGCT.Sample.loadfile.txt
2017-09-16 23:16:20,970[INFO]: Writing filtered samples to /broad/hptmp/mnoble/loadfiles/TCGA/2017_09_16/TCGA-TGCT.filtered_samples.txt
2017-09-16 23:16:21,043[INFO]: Writing sample set loadfile to /broad/hptmp/mnoble/loadfiles/TCGA/2017_09_16/TCGA-TGCT.Sample_Set.loadfile.txt
2017-09-16 23:16:21,066[INFO]: Generating loadfile for TCGA-UCS
2017-09-16 23:16:21,066[INFO]: Writing cases loadfile to /broad/hptmp/mnoble/loadfiles/TCGA/2017_09_16/TCGA-UCS.Participant.loadfile.txt
2017-09-16 23:16:21,069[INFO]: Writing samples loadfile to /broad/hptmp/mnoble/loadfiles/TCGA/2017_09_16/TCGA-UCS.Sample.loadfile.txt
2017-09-16 23:16:21,071[INFO]: Writing filtered samples to /broad/hptmp/mnoble/loadfiles/TCGA/2017_09_16/TCGA-UCS.filtered_samples.txt
2017-09-16 23:16:21,100[INFO]: Writing sample set loadfile to /broad/hptmp/mnoble/loadfiles/TCGA/2017_09_16/TCGA-UCS.Sample_Set.loadfile.txt
2017-09-16 23:16:21,114[INFO]: Generating pan-cohort loadfiles for TCGA
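
For anyone who wants to confirm that a directory holds datestamped results before pointing a tool at it, here is a small sketch of such a check. It assumes only what the log above shows, namely that datestamps look like 2017_09_16 and appear as directory names. Where gdctools actually looks for them under ROOT_DIR is not shown here, so the helper names datestamps_in and require_datestamps and the "direct subdirectories" layout are hypothetical.

```python
import re
from pathlib import Path

# Datestamp format as it appears in the log output above: YYYY_MM_DD.
DATESTAMP = re.compile(r"\d{4}_\d{2}_\d{2}")


def datestamps_in(directory):
    """Return sorted datestamp-named subdirectories directly under `directory`.

    Assumption: datestamps are immediate subdirectory names; the real
    on-disk layout used by gdctools may differ.
    """
    directory = Path(directory)
    if not directory.is_dir():
        return []
    # Zero-padded YYYY_MM_DD names sort chronologically as plain strings.
    return sorted(
        entry.name
        for entry in directory.iterdir()
        if entry.is_dir() and DATESTAMP.fullmatch(entry.name)
    )


def require_datestamps(directory):
    """Raise ValueError if no datestamped subdirectory exists, else list them."""
    found = datestamps_in(directory)
    if not found:
        raise ValueError("No datestamps found under %s" % directory)
    return found
```

An empty or missing directory raises ValueError here, which mirrors the situation described above where ROOT_DIR was empty.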