epi2me-labs / wf-transcriptomes

[Bug]: Trying to run on docker, fails on mapping #6

Closed Dan2389 closed 1 year ago

Dan2389 commented 1 year ago

What happened?

Can anyone point me in the right direction? I am trying to run this workflow on Windows using Docker. The analysis starts but then fails at the read-mapping step. I believe I gave all the right inputs.

Operating System

Windows 11

Workflow Execution

EPI2ME Labs desktop application

Workflow Execution - EPI2ME Labs Versions

EPI2ME LABS 3.1.5; environment v1.2.6

Workflow Execution - Execution Profile

Docker

Workflow Version

0.1.5

Relevant log output

Checking fastq input.

Single directory input detected.

Doing reference based transcript analysis

[5d/8fa420] Submitted process > pipeline:getVersions

[7a/619780] Submitted process > pipeline:getParams

[f9/cd390c] Submitted process > pipeline:summariseConcatReads (1)

[c5/3373c9] Submitted process > start_ping:pingMessage (1)

[e3/eaddea] Submitted process > end_ping:pingMessage

[a3/9c3872] Submitted process > pipeline:build_minimap_index

[62/e25967] Submitted process > pipeline:preprocess_reads (1)

[fe/4c99b0] Submitted process > pipeline:reference_assembly:map_reads (1)

Error executing process > 'pipeline:reference_assembly:map_reads (1)'

Caused by:

Process `pipeline:reference_assembly:map_reads (1)` terminated with an error exit status (1)

Command executed:

minimap2 -t 4 -ax splice -uf genome_index.mmi flongle1_full_length_reads.fastq | samtools view -q 40 -F 2304 -Sb - | seqkit bam -j 4 -x -T 'AlnContext: { Ref: "Homo_sapiens.GRCh38.cdna.all.fa.gz", LeftShift: -24,

RightShift: 24, RegexEnd: "[Aa]{8,}",

Stranded: True,Invert: True, Tsv: "internal_priming_fail.tsv"} ' - | samtools sort -@ 4 -o "flongle1_reads_aln_sorted.bam" - ;

((cat "flongle1_reads_aln_sorted.bam" | seqkit bam -s -j 4 - 2>&1) | tee flongle1_read_aln_stats.tsv ) || true

if [[ -s "internal_priming_fail.tsv" ]];

then

tail -n +2 "internal_priming_fail.tsv" | awk '{print ">" $1 "\n" $4 }' - > "context_internal_priming_fail_start.fasta"

tail -n +2 "internal_priming_fail.tsv" | awk '{print ">" $1 "\n" $6 }' - > "context_internal_priming_fail_end.fasta"

fi

Command exit status:

1

Command output:

(empty)

Command error:

[WARNING] Indexing parameters (-k, -w or -H) overridden by parameters used in the prebuilt index.

[M::main::57.662*0.21] loaded/built the index for 207877 target sequence(s)

[M::mm_mapopt_update::57.857*0.21] mid_occ = 187

[M::mm_idx_stat] kmer size: 14; skip: 10; is_hpc: 0; #seq: 207877

[M::mm_idx_stat::57.966*0.21] distinct minimizers: 12328054 (29.91% are singletons); average occurrences: 5.746; average spacing: 5.442; total length: 385455277

[INFO] create FASTA index for Homo_sapiens.GRCh38.cdna.all.fa.gz

[ERRO] different line length in sequence: (unreadable binary bytes omitted)

[E::bgzf_read_block] Failed to read BGZF block data at offset 2870666 expected 16395 bytes; hread returned 12900

[E::bgzf_read] Read block operation failed with error 4 after 1 of 4 bytes

[E::bam_hdr_read] Error reading BGZF stream

samtools sort: failed to read header from "-"

Work dir:

/mnt/wsl/docker-desktop-bind-mounts/Ubuntu/d688807fefa04ae1460fa2a774a1fcbb60dc46ec170353d4f920653078cda3aa/epi2melabs-data/nextflow/instances/2022-11-02-13-11_wf-transcriptomes_8ntQExDSKiXfevrFrBUDgv/work/fe/4c99b086f3f0e1275936221a471c3d

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

WARN: Input tuple does not match input set cardinality declared by process `pipeline:makeReport` -- offending value: [[]]

keenhl commented 1 year ago

I am running on Linux using the Conda environment and am getting the same error. I'm going to try a different reference and see what happens.

keenhl commented 1 year ago

Update: I unzipped the reference file and the pipeline works now.
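
For anyone who lands here with the same error: the log above suggests the `seqkit bam` step tried to index the gzipped reference as plain text, which would explain the binary characters in the `[ERRO]` line. Below is a minimal sketch of the workaround, assuming the reference is an ordinary gzip file. The function names `is_gzipped` and `prepare_reference` are made up for illustration and are not part of wf-transcriptomes.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not part of wf-transcriptomes.

# Succeeds (exit 0) if the file starts with the gzip magic bytes 1f 8b.
is_gzipped() {
    local magic
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# If the reference is gzipped, write a decompressed copy next to it
# (the original .gz is kept) and print the path of the copy.
# Otherwise print the input path unchanged.
prepare_reference() {
    local ref="$1"
    if is_gzipped "$ref"; then
        local out="${ref%.gz}"
        # If the name did not end in .gz, avoid overwriting the input.
        [ "$out" = "$ref" ] && out="${ref}.decompressed"
        gzip -dc "$ref" > "$out" || return 1
        printf '%s\n' "$out"
    else
        printf '%s\n' "$ref"
    fi
}
```

You would then call something like `prepare_reference Homo_sapiens.GRCh38.cdna.all.fa.gz` and give the printed path to the workflow as the reference instead of the `.gz` file.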