daler closed 8 years ago
Looked things over and all looks good. Are we still planning on one massive config file? If so, it may make sense to move mapping/config.yaml to test or some other higher-level folder.
I deleted references-test and tried running the tests. This error popped up on the yeast genome. I have lab meeting now, but will try troubleshooting after that.
Error in job download_fasta while creating output file /data/LCDB/references-test/sacCer3/sacCer3_default.fa.gz.
RuleException:
TypeError in line 66 of /gpfs/gsfs4/users/LCDB/users/fearjm/lcdb-workflows/workflows/references/references.snakefile:
filename must be a str or bytes object, or a file
File "/gpfs/gsfs4/users/LCDB/users/fearjm/lcdb-workflows/workflows/references/references.snakefile", line 110, in __rule_download_fasta
File "/gpfs/gsfs4/users/LCDB/users/fearjm/lcdb-workflows/workflows/references/references.snakefile", line 66, in download_and_postprocess
File "/gpfs/gsfs4/users/LCDB/users/fearjm/lcdb-workflows/workflows/references/sacCer3.py", line 11, in fasta_postprocess
File "/home/fearjm/opt/miniconda3/envs/lcdb-workflows-fearjm-env/lib/python3.5/tarfile.py", line 1562, in open
File "/home/fearjm/opt/miniconda3/envs/lcdb-workflows-fearjm-env/lib/python3.5/tarfile.py", line 1660, in bz2open
File "/home/fearjm/opt/miniconda3/envs/lcdb-workflows-fearjm-env/lib/python3.5/bz2.py", line 102, in __init__
Traceback (most recent call last):
File "test/run_test.py", line 155, in <module>
sp.check_call(['bash', script_name])
File "/home/fearjm/opt/miniconda3/lib/python3.5/subprocess.py", line 584, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['bash', 'lcdb-workflows-submit-8307974e.sh']' returned non-zero exit status 1
lcdb-init.py ran cleanly (but needed chmod a+x).
I hadn't deleted the entire references-test to re-run (takes a while to rebuild). Looks like the post-process functions need to be updated to support lists rather than single downloaded files. I'll push some updates once I build in a new references dir.
Actually, the hg19 and dm6 postprocess functions were fine since they were using expansion into shell() placeholders (so lists were correctly appearing as a space-delimited string in the command). It was just the sacCer3 one that needed updating.
By the way, you can override a config item from the command line. So instead of deleting the configured dir to force a re-run, try specifying a different one:
snakemake \
-s references.snakefile \
--configfile ../mapping/config.yaml \
--config=data_dir=/new/path/to/references
Also, I agree that the config should be in a higher-level dir. I'm working on another branch that does higher-level config; I'll move it in that branch.
Would it be worth adding an option to run_test.py to pass --config command line options directly to snakemake?
For example:
test/run_test.py . --build-env --clean --workflow=workflows/mapping --config=data_dir=/new/path/to/references
Already in that other branch :)
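For reference, the pass-through idea discussed above could be sketched roughly as follows. This is a standalone illustration, not the code in run_test.py or the other branch: `build_snakemake_cmd` and the `--snakefile` option are hypothetical names, and the only snakemake behavior assumed is that `-s` names the snakefile and `--config` takes KEY=VALUE pairs, as in the command shown earlier in the thread.

```python
import argparse


def build_snakemake_cmd(argv):
    """Turn (hypothetical) test-runner arguments into a snakemake command list."""
    ap = argparse.ArgumentParser()
    ap.add_argument("--snakefile", required=True)
    # Collect KEY=VALUE overrides; repeating the flag accumulates values.
    ap.add_argument("--config", action="append", default=[], metavar="KEY=VALUE")
    args = ap.parse_args(argv)

    # Fail early on malformed overrides instead of letting snakemake choke.
    for item in args.config:
        if "=" not in item:
            ap.error("--config expects KEY=VALUE, got %r" % item)

    cmd = ["snakemake", "-s", args.snakefile]
    if args.config:
        # Forward all overrides after a single --config flag.
        cmd += ["--config"] + args.config
    return cmd
```

With this sketch, `build_snakemake_cmd(["--snakefile", "references.snakefile", "--config=data_dir=/new/path/to/references"])` produces a command list that ends in `--config data_dir=/new/path/to/references`, so a fresh references directory can be chosen per test run without deleting the configured one.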
Ok everything tested clean. Going to go ahead and merge.
OK, thanks. hisat2 on hg19 is taking forever for me on an interactive node. If yours tested clean I'll stop my test.