lcdb / lcdb-workflows

DEPRECATED. Please see https://github.com/lcdb/lcdb-wf
MIT License

References update 2 #36

Closed daler closed 8 years ago

daler commented 8 years ago
jfear commented 8 years ago

Looked things over and it all looks good. Do we still want one massive config file? If so, it may make sense to move mapping/config.yaml to test or some other higher-level folder.

I deleted references-test and tried running the tests. This error popped up on the yeast genome. I have lab meeting now, but will try troubleshooting after that.

Error in job download_fasta while creating output file /data/LCDB/references-test/sacCer3/sacCer3_default.fa.gz.
RuleException:
TypeError in line 66 of /gpfs/gsfs4/users/LCDB/users/fearjm/lcdb-workflows/workflows/references/references.snakefile:
filename must be a str or bytes object, or a file
  File "/gpfs/gsfs4/users/LCDB/users/fearjm/lcdb-workflows/workflows/references/references.snakefile", line 110, in __rule_download_fasta
  File "/gpfs/gsfs4/users/LCDB/users/fearjm/lcdb-workflows/workflows/references/references.snakefile", line 66, in download_and_postprocess
  File "/gpfs/gsfs4/users/LCDB/users/fearjm/lcdb-workflows/workflows/references/sacCer3.py", line 11, in fasta_postprocess
  File "/home/fearjm/opt/miniconda3/envs/lcdb-workflows-fearjm-env/lib/python3.5/tarfile.py", line 1562, in open
  File "/home/fearjm/opt/miniconda3/envs/lcdb-workflows-fearjm-env/lib/python3.5/tarfile.py", line 1660, in bz2open
  File "/home/fearjm/opt/miniconda3/envs/lcdb-workflows-fearjm-env/lib/python3.5/bz2.py", line 102, in __init__

Traceback (most recent call last):
  File "test/run_test.py", line 155, in <module>
    sp.check_call(['bash', script_name])
  File "/home/fearjm/opt/miniconda3/lib/python3.5/subprocess.py", line 584, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['bash', 'lcdb-workflows-submit-8307974e.sh']' returned non-zero exit status 1

lcdb-init.py ran cleanly (but needed chmod a+x).

daler commented 8 years ago

I hadn't deleted the entire references-test to re-run (takes a while to rebuild). Looks like the post-process functions need to be updated to support lists rather than single downloaded files. I'll push some updates once I build in a new references dir.

daler commented 8 years ago

Actually the hg19 and dm6 postprocess functions were fine, since they were using expansion into shell() placeholders (so lists correctly appeared as a space-delimited string in the command). It was just the sacCer3 one that needed updating.
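
For readers following along, here is a minimal sketch of the kind of change described above: a postprocess function that accepts either a single downloaded path or a list of them. It is not the repository's actual sacCer3 code; `as_list` and `concat_to_gzip` are hypothetical names, and the inputs are assumed to be uncompressed text files.

```python
import gzip
import shutil


def as_list(downloaded):
    """Return the downloaded paths as a list, whether one path or several were given."""
    # A bare str/bytes path is wrapped; any other iterable is copied into a list.
    if isinstance(downloaded, (str, bytes)):
        return [downloaded]
    return list(downloaded)


def concat_to_gzip(downloaded, outfile):
    """Hypothetical postprocess step: concatenate the downloads into one gzipped file."""
    with gzip.open(outfile, "wb") as fout:
        for path in as_list(downloaded):
            with open(path, "rb") as fin:
                shutil.copyfileobj(fin, fout)
```

Normalizing the argument up front avoids handing a list to a call such as tarfile.open(), which expects a single filename or file object and otherwise fails with a TypeError like the one in the traceback.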

By the way, you can override a config item from the command line. So instead of deleting the configured dir to force a re-run, try specifying a different one:

snakemake \
  -s references.snakefile \
  --configfile ../mapping/config.yaml \
  --config=data_dir=/new/path/to/references

daler commented 8 years ago

Also, I agree that the config should be in a higher-level dir. I'm working on another branch that does higher-level config; I'll move it in that branch.

jfear commented 8 years ago

Would it be worth adding an option to run_test.py to pass --config command line options directly to snakemake?

For example:

test/run_test.py . --build-env --clean --workflow=workflows/mapping --config=data_dir=/new/path/to/references
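
One way such a pass-through could look is sketched below. This is an assumption-laden illustration, not the code from the other branch: `snakemake_config_args` is a hypothetical helper, and it only handles the `--config` option, leaving every other run_test.py argument untouched.

```python
import argparse


def snakemake_config_args(argv):
    """Collect --config KEY=VALUE items from argv and return them as snakemake arguments."""
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    # action="append" lets --config be given more than once.
    parser.add_argument("--config", action="append", default=[], metavar="KEY=VALUE")
    # parse_known_args ignores options this sketch does not define (e.g. --clean).
    args, _other = parser.parse_known_args(argv)

    for item in args.config:
        if "=" not in item:
            raise ValueError("expected KEY=VALUE, got %r" % item)

    # snakemake accepts several KEY=VALUE pairs after a single --config flag.
    return ["--config"] + args.config if args.config else []
```

The returned list could then be appended to whatever snakemake command the test runner already builds; with no `--config` options given it is empty, so the existing behavior is unchanged.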

daler commented 8 years ago

Already in that other branch :)

jfear commented 8 years ago

Ok everything tested clean. Going to go ahead and merge.

daler commented 8 years ago

OK, thanks. hisat2 on hg19 is taking forever for me on an interactive node. If yours tested clean I'll stop my test.