ngless-toolkit / ngless

NGLess: NGS with less work
https://ngless.embl.de

Parallel module exits with an error if the data are not in the same directory as the script #141

Closed vpuller closed 3 years ago

vpuller commented 3 years ago

The parallel module exits with an error if the data are not in the same directory as the script. Running the parallel.ngl test fails once the data file is moved into a subfolder (sample.fq -> sample/sample.fq). NGLess seems unable to rename a temporary file that it has just created:

Exiting after fatal error:
An unhandled erorr occurred (this should not happen)!

    If you can reproduce this issue, please run your script
    with the --trace flag and report a bug (including the script and the trace) at
        https://github.com/ngless-toolkit/ngless/issues

The error message was: `/home/puller/ngless_vadim/tmp/partial.compress.tsv18674-5.gz: renameFile:renamePath:rename: does not exist (No such file or directory)`
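For context, a rename can fail with "No such file or directory" even though the source file exists, namely when the directory part of the destination is missing. The Python sketch below demonstrates only that general POSIX behaviour. Whether this is what happens inside NGLess is an assumption and is not confirmed anywhere in this thread; the file names and the helper `try_rename_into_missing_dir` are hypothetical.

```python
import os
import tempfile


def try_rename_into_missing_dir():
    """Rename an existing file to a path whose parent directory is missing.

    Returns (source_still_exists, exception_class_name_or_None).
    """
    with tempfile.TemporaryDirectory() as tmp:
        # The source file is created, so it definitely exists.
        src = os.path.join(tmp, "partial.tsv.gz")
        with open(src, "w") as fh:
            fh.write("data\n")

        # Hypothetical destination: the "sample/" directory is never created.
        dst = os.path.join(tmp, "sample", "sample.fq.tsv.gz")
        try:
            os.rename(src, dst)
        except OSError as exc:
            return os.path.exists(src), type(exc).__name__
        return os.path.exists(src), None


if __name__ == "__main__":
    print(try_rename_into_missing_dir())
```

If a sample name containing a `/` were used to build an output path, this is one way a rename of a freshly created temporary file could fail with this message.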

[Thu 03-12-2020 10:02:11]: # Configuration
[Thu 03-12-2020 10:02:11]:  download base URL: https://ngless.embl.de/resources/
[Thu 03-12-2020 10:02:11]:  global data directory: /home/puller/ngless_vadim/
[Thu 03-12-2020 10:02:11]:  user directory: /home/puller/ngless_vadim/user-data/
[Thu 03-12-2020 10:02:11]:  user data directory: /home/puller/.local/share/ngless/data
[Thu 03-12-2020 10:02:11]:  temporary directory: /home/puller/ngless_vadim/tmp/
[Thu 03-12-2020 10:02:11]:  keep temporary files: False
[Thu 03-12-2020 10:02:11]:  create report: True
[Thu 03-12-2020 10:02:11]:  report directory: parallel.ngl.output_ngless
[Thu 03-12-2020 10:02:11]:  color setting: AutoColor
[Thu 03-12-2020 10:02:11]:  print header: True
[Thu 03-12-2020 10:02:11]:  subsample: False
[Thu 03-12-2020 10:02:11]:  verbosity: Normal
[Thu 03-12-2020 10:02:11]:  search path:
[Thu 03-12-2020 10:02:11]:      References=/data/ref/
[Thu 03-12-2020 10:02:11]: Loading modules...
[Thu 03-12-2020 10:02:11]: Validating script...
[Thu 03-12-2020 10:02:11]: Looking for file 'input.txt' (search path is ["References=/data/ref/"])
[Thu 03-12-2020 10:02:11]: Looking for file (input.txt) in input.txt
[Thu 03-12-2020 10:02:11]: Looking for file 'ref.fna' (search path is ["References=/data/ref/"])
[Thu 03-12-2020 10:02:11]: Looking for file (ref.fna) in ref.fna
[Thu 03-12-2020 10:02:11]: Writing to file 'output.tsv' will overwrite existing file.
[Thu 03-12-2020 10:02:11]: Writing to file 'compressed.tsv.gz' will overwrite existing file.
[Thu 03-12-2020 10:02:11]: Transforming script...
NGLess v1.2.0 (C) NGLess authors
https://ngless.embl.de/

When publishing results from this script, please cite the following references:

     - Coelho, L.P., Alves, R., Monteiro, P., Huerta-Cepas, J., Freitas, A.T., and Bork, P.,
     NG-meta-profiler: fast processing of metagenomes using NGLess, a domain-specific language. in
     Microbiome 7:84 (2019). DOI: http://doi.org/10.1186/s40168-019-0684-8

     - Li, H., 2013. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv
     preprint arXiv:1303.3997.

[Thu 03-12-2020 10:02:11]: Script OK. Starting interpretation...
[Thu 03-12-2020 10:02:11] Line 8: Running garbage collection.
[Thu 03-12-2020 10:02:11] Line 8: Interpreting [interpretIO]: __check_count(__VOID; original_lno=8; features=["seqname"])
[Thu 03-12-2020 10:02:11] Line 17: Running garbage collection.
[Thu 03-12-2020 10:02:11] Line 17: Interpreting [interpretIO]: __check_count(__VOID; original_lno=17; features=["seqname"])
[Thu 03-12-2020 10:02:11] Line 4: Running garbage collection.
[Thu 03-12-2020 10:02:11] Line 4: Interpreting [interpretIO]: allsamples = readlines("input.txt")
[Thu 03-12-2020 10:02:11] Line 4: Interpreting [assignment]: readlines("input.txt")
[Thu 03-12-2020 10:02:11] Line 4: Interpreting [executing module function: 'readlines']: NGOString "input.txt"
[Thu 03-12-2020 10:02:11] Line 5: Running garbage collection.
[Thu 03-12-2020 10:02:11] Line 5: Interpreting [interpretIO]: sample = lock1(Lookup 'allsamples' as NGList NGLString; __hash="29a4de8241eb49022b39cf4f32f0ee8c")
[Thu 03-12-2020 10:02:11] Line 5: Interpreting [assignment]: lock1(Lookup 'allsamples' as NGList NGLString; __hash="29a4de8241eb49022b39cf4f32f0ee8c")
[Thu 03-12-2020 10:02:11] Line 5: Interpreting [executing module function: 'lock1']: NGOList [NGOString "sample/sample.fq"]
[Thu 03-12-2020 10:02:11] Line 5: Looking for a lock in ngless-locks/29a4de82. Total number of elements is 1 (not locked: 1; not finished: 1).
[Thu 03-12-2020 10:02:11] Line 5: Acquired lock file ngless-locks/29a4de82/sample_sample.fq.lock
[Thu 03-12-2020 10:02:11] Line 5: lock1: Obtained lock file: 'ngless-locks/29a4de82/sample_sample.fq.lock'
[Thu 03-12-2020 10:02:11] Line 5: Writing stats to 'ngless-stats/29a4de82/sample_sample.fq'
[Thu 03-12-2020 10:02:11] Line 5: Running garbage collection.
[Thu 03-12-2020 10:02:11] Line 5: Interpreting [interpretIO]: __check_ifile(Lookup 'sample' as NGLString; original_lno=6)
[Thu 03-12-2020 10:02:11] Line 5: Interpreting [executing module function: '__check_ifile']: NGOString "sample/sample.fq"
[Thu 03-12-2020 10:02:11] Line 6: Running garbage collection.
[Thu 03-12-2020 10:02:11] Line 6: Interpreting [interpretIO]: __check_ifile(Lookup 'sample' as NGLString; original_lno=6)
[Thu 03-12-2020 10:02:11] Line 6: Interpreting [executing module function: '__check_ifile']: NGOString "sample/sample.fq"
[Thu 03-12-2020 10:02:11] Line 6: Running garbage collection.
[Thu 03-12-2020 10:02:11] Line 6: Interpreting [interpretIO]: input = fastq(Lookup 'sample' as NGLString)
[Thu 03-12-2020 10:02:11] Line 6: Interpreting [assignment]: fastq(Lookup 'sample' as NGLString)
[Thu 03-12-2020 10:02:11] Line 6: Simple Statistics completed for: sample/sample.fq
[Thu 03-12-2020 10:02:11] Line 6: Number of base pairs: 584
[Thu 03-12-2020 10:02:11] Line 6: Encoding is: SangerEncoding
[Thu 03-12-2020 10:02:11] Line 6: Number of sequences: 2772
[Thu 03-12-2020 10:02:11] Line 7: Running garbage collection.
[Thu 03-12-2020 10:02:11] Line 7: Interpreting [interpretIO]: mapped = map(Lookup 'input' as NGLReadSet; fafile="ref.fna")
[Thu 03-12-2020 10:02:11] Line 7: Interpreting [assignment]: map(Lookup 'input' as NGLReadSet; fafile="ref.fna")
[Thu 03-12-2020 10:02:11] Line 7: Looking for file 'ref.fna' (search path is ["References=/data/ref/"])
[Thu 03-12-2020 10:02:11] Line 7: Looking for file (ref.fna) in ref.fna
[Thu 03-12-2020 10:02:11] Line 7: Index for ref.fna already exists.
[Thu 03-12-2020 10:02:11] Line 7: Created & opened temporary file /home/puller/ngless_vadim/tmp/mapped_ref.sam18674-2.zstd
[Thu 03-12-2020 10:02:11] Line 7: Starting mapping to ref.fna
[Thu 03-12-2020 10:02:11] Line 7: Will run process /opt/miniconda3/bin/../share/ngless/bin/ngless-1.2.0-bwa mem -t 1 -K 100000000 ref-bwa-0.7.17.fna -p -
[Thu 03-12-2020 10:02:12] Line 7: Stderr: [M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 2772 sequences (643849 bp)...
[M::process] 2772 single-end sequences; 0 paired-end sequences
[M::mem_process_seqs] Processed 2772 reads in 0.301 CPU sec, 0.301 real sec
[main] Version: 0.7.17-r1188
[main] CMD: /opt/miniconda3/bin/../share/ngless/bin/ngless-1.2.0-bwa mem -t 1 -K 100000000 -p ref-bwa-0.7.17.fna -
[main] Real time: 0.343 sec; CPU: 0.321 sec

[Thu 03-12-2020 10:02:12] Line 7: Success
[Thu 03-12-2020 10:02:12] Line 7: Mapped readset stats (ref.fna):
[Thu 03-12-2020 10:02:12] Line 7: Total reads: 2772
[Thu 03-12-2020 10:02:12] Line 7: Total reads aligned: 551 [19.88%]
[Thu 03-12-2020 10:02:12] Line 7: Total reads Unique map: 545 [19.66%]
[Thu 03-12-2020 10:02:12] Line 7: Total reads Non-Unique map: 6 [0.22%]
[Thu 03-12-2020 10:02:12] Line 8: Running garbage collection.
[Thu 03-12-2020 10:02:12] Line 8: Interpreting [interpretIO]: counts = count(Lookup 'mapped' as NGLMappedReadSet; features=["seqname"])
[Thu 03-12-2020 10:02:12] Line 8: Interpreting [assignment]: count(Lookup 'mapped' as NGLMappedReadSet; features=["seqname"])
[Thu 03-12-2020 10:02:12] Line 8: Starting count...
[Thu 03-12-2020 10:02:12] Line 8: Loaded headers. Starting parsing/distribution.
[Thu 03-12-2020 10:02:12] Line 8: Counts (second pass)...
[Thu 03-12-2020 10:02:12] Line 8: Created & opened temporary file /home/puller/ngless_vadim/tmp/counts.mapped_ref18674-3.txt
[Thu 03-12-2020 10:02:12] Line 10: Running garbage collection.
[Thu 03-12-2020 10:02:12] Line 10: Interpreting [interpretIO]: collect(Lookup 'counts' as NGLCounts; __hash="8a1f8bb9f282748d0d48635fa5209289"; current=Lookup 'sample' as NGLString; allneeded=Lookup 'allsamples' as NGList NGLString; ofile="output.tsv"; auto_comments=[{script}])
[Thu 03-12-2020 10:02:12] Line 10: Interpreting [executing module function: 'collect']: NGOCounts File /home/puller/ngless_vadim/tmp/counts.mapped_ref18674-3.txt
[Thu 03-12-2020 10:02:12] Line 10: Created & opened temporary file /home/puller/ngless_vadim/tmp/partial.compress.tsv18674-5.gz
luispedro commented 3 years ago

Thanks for the report. I can reproduce it and will add a test case to the test suite that captures it.

luispedro commented 3 years ago

Thanks for the bug report. This is fixed in the development version and will be included in the next release. In the meantime, I hope you can find a quick workaround using symlinks or similar.
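A minimal sketch of the symlink workaround, assuming the only problem is that the sample paths contain a subdirectory: it links each data file into the script's directory, so the sample list can use flat names. The helper `flatten_with_symlinks` is hypothetical and not part of NGLess.

```python
import os
from pathlib import Path


def flatten_with_symlinks(sample_paths, dest_dir="."):
    """Symlink each sample file into dest_dir and return the flat file names."""
    dest = Path(dest_dir)
    flat_names = []
    for p in sample_paths:
        src = Path(p).resolve()      # absolute path of the real data file
        link = dest / src.name       # e.g. sample/sample.fq -> ./sample.fq
        if link.is_symlink() or link.exists():
            # Refuse to reuse a name that points at a different file.
            if link.resolve() != src:
                raise FileExistsError(f"{link} exists and is not a link to {src}")
        else:
            os.symlink(src, link)
        flat_names.append(link.name)
    return flat_names


if __name__ == "__main__":
    # Paths as they would appear in the sample list of the report above.
    print(flatten_with_symlinks(["sample/sample.fq"]))
```

With the links in place, `input.txt` would list `sample.fq` instead of `sample/sample.fq`. Two samples with the same file name in different subfolders would collide, which the sketch reports as an error instead of overwriting.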