Hello,
I tried salmon 1.2 and now also 1.3, but salmon fails when I run your tests:
#/usr/bin/make -j2 check VERBOSE=1
./test.sh
Commencing snakemake run submission locally
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job counts:
count jobs
1 all
1 collate_read_counts
6 count_reads
1 counts_from_SALMON
6 fastqc
6 genomeCoverage
6 index_bam
1 multiqc
1 norm_counts_deseq
1 report1
1 report2
1 report3
1 salmon_index
6 salmon_quant
6 sort_bam
1 star_index
6 star_map
1 translate_sample_sheet_for_report
4 trim_galore_pe
2 trim_galore_se
59
[Fri Jul 31 17:54:24 2020]
rule salmon_index:
input: /home/moeller/pigx-rnaseq/pigx-rnaseq/tests/sample_data/sample.cdna.fasta
output: /home/moeller/pigx-rnaseq/pigx-rnaseq/tests/output/salmon_index/sa.bin
log: /home/moeller/pigx-rnaseq/pigx-rnaseq/tests/output/logs/salmon_index.log
jobid: 2
/usr/bin/salmon index -t /home/moeller/pigx-rnaseq/pigx-rnaseq/tests/sample_data/sample.cdna.fasta -i /home/moeller/pigx-rnaseq/pigx-rnaseq/tests/output/salmon_index -p 8 >> /home/moeller/pigx-rnaseq/pigx-rnaseq/tests/output/logs/salmon_index.log 2>&1
Waiting at most 5 seconds for missing files.
MissingOutputException in line 320 of /home/moeller/pigx-rnaseq/pigx-rnaseq/pigx_rnaseq.py:
Job completed successfully, but some output files are missing. Missing files after 5 seconds:
/home/moeller/pigx-rnaseq/pigx-rnaseq/tests/output/salmon_index/sa.bin
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
File "/usr/lib/python3/dist-packages/snakemake/executors/__init__.py", line 544, in handle_job_success
File "/usr/lib/python3/dist-packages/snakemake/executors/__init__.py", line 225, in handle_job_success
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /home/moeller/pigx-rnaseq/pigx-rnaseq/tests/output/.snakemake/log/2020-07-31T175423.936343.snakemake.log
ERROR: could not find report for SALMON at transcript level
make[1]: *** [debian/rules:21: override_dh_auto_test] Fehler 1
I set "jobs: 1" in tests/settings.yaml to avoid extra complexity in this report.
The salmon logfile is a bit odd in that it only mentions step 1 of 4:
$ cat /home/moeller/pigx-rnaseq/pigx-rnaseq/tests/output/logs/salmon_index.log
[2020-07-31 17:56:47.683] [jLog] [warning] The salmon index is being built without any decoy sequences. It is recommended that decoy sequence (either computed auxiliary decoy sequence or the genome of the organism) be provided during indexing. Further details can be found at https://salmon.readthedocs.io/en/latest/salmon.html#preparing-transcriptome-indices-mapping-based-mode.
[2020-07-31 17:56:47.683] [jLog] [info] building index
out : /home/moeller/pigx-rnaseq/pigx-rnaseq/tests/output/salmon_index
[2020-07-31 17:56:47.683] [puff::index::jointLog] [info] Running fixFasta
[Step 1 of 4] : counting k-mers
[2020-07-31 17:56:47.990] [puff::index::jointLog] [warning] Removed 1 transcripts that were sequence duplicates of indexed transcripts.
[2020-07-31 17:56:47.990] [puff::index::jointLog] [warning] If you wish to retain duplicate transcripts, please use the `--keepDuplicates` flag
[2020-07-31 17:56:47.990] [puff::index::jointLog] [info] Replaced 0 non-ATCG nucleotides
[2020-07-31 17:56:47.990] [puff::index::jointLog] [info] Clipped poly-A tails from 17 transcripts
wrote 3654 cleaned references
[2020-07-31 17:56:48.015] [puff::index::jointLog] [info] Filter size not provided; estimating from number of distinct k-mers
[2020-07-31 17:56:48.128] [puff::index::jointLog] [info] ntHll estimated 2236392 distinct k-mers, setting filter size to 2^26
Threads = 8
Vertex length = 31
Hash functions = 5
Filter size = 67108864
Capacity = 2
Files:
/home/moeller/pigx-rnaseq/pigx-rnaseq/tests/output/salmon_index/ref_k31_fixed.fa
--------------------------------------------------------------------------------
Round 0, 0:67108864
Pass Filling Filtering
1 0 0
2 1 0
True junctions count = 12090
False junctions count = 7469
Hash table size = 19559
Candidate marks count = 84180
--------------------------------------------------------------------------------
Reallocating bifurcations time: 0
True marks count: 71140
Edges construction time: 0
--------------------------------------------------------------------------------
Distinct junctions = 12090
allowedIn: 15
Max Junction ID: 14549
seen.size():116401 kmerInfo.size():14550
approximateContigTotalLength: 1606116
counters for complex kmers:
(prec>1 & succ>1)=372 | (succ>1 & isStart)=6 | (prec>1 & isEnd)=14 | (isStart & isEnd)=3
contig count: 17787 element count: 2811269 complex nodes: 395
# of ones in rank vector: 17786
[2020-07-31 17:56:50.225] [puff::index::jointLog] [info] Starting the Pufferfish indexing by reading the GFA binary file.
[2020-07-31 17:56:50.225] [puff::index::jointLog] [info] Setting the index/BinaryGfa directory /home/moeller/pigx-rnaseq/pigx-rnaseq/tests/output/salmon_index
size = 2811269
-----------------------------------------
| Loading contigs | Time = 552.52 us
-----------------------------------------
size = 2811269
-----------------------------------------
| Loading contig boundaries | Time = 259.58 us
-----------------------------------------
Number of ones: 17786
Number of ones per inventory item: 512
Inventory entries filled: 35
17786
[2020-07-31 17:56:50.238] [puff::index::jointLog] [info] Done wrapping the rank vector with a rank9sel structure.
[2020-07-31 17:56:50.239] [puff::index::jointLog] [info] contig count for validation: 17,786
[2020-07-31 17:56:50.248] [puff::index::jointLog] [info] Total # of Contigs : 17,786
[2020-07-31 17:56:50.248] [puff::index::jointLog] [info] Total # of numerical Contigs : 17,786
[2020-07-31 17:56:50.249] [puff::index::jointLog] [info] Total # of contig vec entries: 68,986
[2020-07-31 17:56:50.249] [puff::index::jointLog] [info] bits per offset entry 17
[2020-07-31 17:56:50.251] [puff::index::jointLog] [info] Done constructing the contig vector. 17787
[2020-07-31 17:56:50.259] [puff::index::jointLog] [info] # segments = 17,786
[2020-07-31 17:56:50.259] [puff::index::jointLog] [info] total length = 2,811,269
[2020-07-31 17:56:50.260] [puff::index::jointLog] [info] Reading the reference files ...
[2020-07-31 17:56:50.294] [puff::index::jointLog] [info] positional integer width = 22
[2020-07-31 17:56:50.294] [puff::index::jointLog] [info] seqSize = 2,811,269
[2020-07-31 17:56:50.294] [puff::index::jointLog] [info] rankSize = 2,811,269
[2020-07-31 17:56:50.294] [puff::index::jointLog] [info] edgeVecSize = 0
[2020-07-31 17:56:50.294] [puff::index::jointLog] [info] num keys = 2,277,689
[Building BooPHF] 100 % elapsed: 0 min 0 sec remaining: 0 min 0 sec
[2020-07-31 17:56:50.448] [puff::index::jointLog] [info] mphf size = 1.42339 MB
[2020-07-31 17:56:50.449] [puff::index::jointLog] [info] chunk size = 351,409
[2020-07-31 17:56:50.449] [puff::index::jointLog] [info] chunk 0 = [0, 351,426)
[2020-07-31 17:56:50.449] [puff::index::jointLog] [info] chunk 1 = [351,426, 702,835)
[2020-07-31 17:56:50.449] [puff::index::jointLog] [info] chunk 2 = [702,835, 1,054,244)
[2020-07-31 17:56:50.449] [puff::index::jointLog] [info] chunk 3 = [1,054,244, 1,405,674)
[2020-07-31 17:56:50.449] [puff::index::jointLog] [info] chunk 4 = [1,405,674, 1,757,083)
[2020-07-31 17:56:50.449] [puff::index::jointLog] [info] chunk 5 = [1,757,083, 2,108,492)
[2020-07-31 17:56:50.449] [puff::index::jointLog] [info] chunk 6 = [2,108,492, 2,459,901)
[2020-07-31 17:56:50.449] [puff::index::jointLog] [info] chunk 7 = [2,459,901, 2,811,239)
[2020-07-31 17:56:50.539] [puff::index::jointLog] [info] finished populating pos vector
[2020-07-31 17:56:50.539] [puff::index::jointLog] [info] writing index components
[2020-07-31 17:56:50.554] [puff::index::jointLog] [info] finished writing dense pufferfish index
[2020-07-31 17:56:50.556] [jLog] [info] done building index
for info, total work write each : 2.331 total work inram from level 3 : 4.322 total work raw : 25.000
Bitarray 11940288 bits (100.00 %) (array + ranks )
final hash 0 bits (0.00 %) (nb in final hash 0)
[2020-07-31 17:56:55.854] [jLog] [warning] The salmon index is being built without any decoy sequences. It is recommended that decoy sequence (either computed auxiliary decoy sequence or the genome of the organism) be provided during indexing. Further details can be found at https://salmon.readthedocs.io/en/latest/salmon.html#preparing-transcriptome-indices-mapping-based-mode.
[2020-07-31 17:56:55.855] [jLog] [info] building index
out : /home/moeller/pigx-rnaseq/pigx-rnaseq/tests/output/salmon_index
[2020-07-31 17:56:55.855] [puff::index::jointLog] [info] Running fixFasta
[Step 1 of 4] : counting k-mers
[2020-07-31 17:56:56.245] [puff::index::jointLog] [warning] Removed 1 transcripts that were sequence duplicates of indexed transcripts.
[2020-07-31 17:56:56.245] [puff::index::jointLog] [warning] If you wish to retain duplicate transcripts, please use the `--keepDuplicates` flag
[2020-07-31 17:56:56.246] [puff::index::jointLog] [info] Replaced 0 non-ATCG nucleotides
[2020-07-31 17:56:56.246] [puff::index::jointLog] [info] Clipped poly-A tails from 17 transcripts
wrote 3654 cleaned references
[2020-07-31 17:56:56.270] [puff::index::jointLog] [info] Filter size not provided; estimating from number of distinct k-mers
[2020-07-31 17:56:56.395] [puff::index::jointLog] [info] ntHll estimated 2236392 distinct k-mers, setting filter size to 2^26
Threads = 8
Vertex length = 31
Hash functions = 5
Filter size = 67108864
Capacity = 2
Files:
/home/moeller/pigx-rnaseq/pigx-rnaseq/tests/output/salmon_index/ref_k31_fixed.fa
--------------------------------------------------------------------------------
Round 0, 0:67108864
Pass Filling Filtering
The disk is not full.
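To double-check which step markers the log actually contains, a small script along these lines could be used. This is only a minimal sketch: `find_steps` and its regular expression are my own illustration, not part of salmon or pigx-rnaseq, and the sample text is copied from the log above.

```python
import re

# Matches progress lines such as "[Step 1 of 4] : counting k-mers".
STEP_RE = re.compile(r"\[Step (\d+) of (\d+)\]\s*:\s*(.*)")


def find_steps(log_text):
    """Return (step, total, description) for every '[Step N of M]' line."""
    steps = []
    for line in log_text.splitlines():
        match = STEP_RE.search(line)
        if match:
            step, total, description = match.groups()
            steps.append((int(step), int(total), description.strip()))
    return steps


# A few lines copied from salmon_index.log as sample input.
sample = """[2020-07-31 17:56:47.683] [puff::index::jointLog] [info] Running fixFasta
[Step 1 of 4] : counting k-mers
wrote 3654 cleaned references
"""

print(find_steps(sample))  # [(1, 4, 'counting k-mers')]
```

For the real file, `find_steps(open(path).read())` with the path of salmon_index.log would list every step marker that was written.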
I also ran the command manually, but no sa.bin file was created. The salmon_index directory offers:
Any idea where I should look? Admittedly, salmon here is the Debian package, not the one from Guix. This failure is what currently blocks the Debian/Ubuntu package of pigx-rnaseq.
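For comparing what the index directory contains with what the rule expects, something like the following could help. It is a sketch under assumptions: `missing_outputs` is a hypothetical helper of mine, not part of Snakemake or pigx-rnaseq, and the demo uses a temporary directory with one file name taken from the log (ref_k31_fixed.fa) instead of the real salmon_index directory.

```python
import tempfile
from pathlib import Path


def missing_outputs(index_dir, expected):
    """Return (present, missing): file names found in index_dir, and the
    expected names that are not among them."""
    index_dir = Path(index_dir)
    if index_dir.is_dir():
        present = sorted(p.name for p in index_dir.iterdir())
    else:
        present = []
    missing = [name for name in expected if name not in present]
    return present, missing


# Demo on a throwaway directory; the real check would point at
# tests/output/salmon_index instead.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "ref_k31_fixed.fa").touch()
    present, missing = missing_outputs(tmp, ["sa.bin"])
    print("present:", present)  # present: ['ref_k31_fixed.fa']
    print("missing:", missing)  # missing: ['sa.bin']
```

If `missing` still lists sa.bin although the index run finished, the declared output of the rule and the files this salmon version writes may simply not match, which is a different problem from filesystem latency.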
Many thanks for your help! Steffen