liulab-dfci / TRUST4

TCR and BCR assembly from RNA-seq data
MIT License

Fails with uninformative error #98

Open jakevc opened 3 years ago

jakevc commented 3 years ago

When running with the command:

./TRUST4/run-trust4 --od ./trust4 -o run1 -f refs/hg38_bcrtcr.fa --ref refs/human_IMGT+C.fa -u data/*_R2_*.fastq.gz --barcode data/*_R1_*.fastq.gz --barcodeRange 0 15 + --UMI data/*_R1_*.fastq.gz --umiRange 16 25 + --repseq --barcodeWhitelist refs/737K-august-2016.txt

Fails with output:

[Wed Nov 17 23:39:31 2021] TRUST4 begins.
[Wed Nov 17 23:39:31 2021] SYSTEM CALL: TRUST4/fastq-extractor -t 1 -f refs/hg38_bcrtcr.fa -o ./run1_toassemble  --barcodeStart 0 --barcodeEnd 15 --umiStart 16 --umiEnd 25 --barcodeWhitelist refs/737K-august-2016.txt -u data/B-A1_S6_L001_R2_001.fastq.gz -u data/B-A1_S6_L002_R2_001.fastq.gz -u data/B-B1_S4_L001_R2_001.fastq.gz -u data/B-B1_S4_L002_R2_001.fastq.gz -u data/B-M1_S2_L001_R2_001.fastq.gz -u data/B-M1_S2_L002_R2_001.fastq.gz -u data/T-A1_S5_L001_R2_001.fastq.gz -u data/T-A1_S5_L002_R2_001.fastq.gz -u data/T-M1_S1_L001_R2_001.fastq.gz -u data/T-M1_S1_L002_R2_001.fastq.gz -u data/T-T1_S3_L001_R2_001.fastq.gz -u data/T-T1_S3_L002_R2_001.fastq.gz --barcode data/B-A1_S6_L001_R1_001.fastq.gz --barcode data/B-A1_S6_L002_R1_001.fastq.gz --barcode data/B-B1_S4_L001_R1_001.fastq.gz --barcode data/B-B1_S4_L002_R1_001.fastq.gz --barcode data/B-M1_S2_L001_R1_001.fastq.gz --barcode data/B-M1_S2_L002_R1_001.fastq.gz --barcode data/T-A1_S5_L001_R1_001.fastq.gz --barcode data/T-A1_S5_L002_R1_001.fastq.gz --barcode data/T-M1_S1_L001_R1_001.fastq.gz --barcode data/T-M1_S1_L002_R1_001.fastq.gz --barcode data/T-T1_S3_L001_R1_001.fastq.gz --barcode data/T-T1_S3_L002_R1_001.fastq.gz --UMI data/B-A1_S6_L001_R1_001.fastq.gz --UMI data/B-A1_S6_L002_R1_001.fastq.gz --UMI data/B-B1_S4_L001_R1_001.fastq.gz --UMI data/B-B1_S4_L002_R1_001.fastq.gz --UMI data/B-M1_S2_L001_R1_001.fastq.gz --UMI data/B-M1_S2_L002_R1_001.fastq.gz --UMI data/T-A1_S5_L001_R1_001.fastq.gz --UMI data/T-A1_S5_L002_R1_001.fastq.gz --UMI data/T-M1_S1_L001_R1_001.fastq.gz --UMI data/T-M1_S1_L002_R1_001.fastq.gz --UMI data/T-T1_S3_L001_R1_001.fastq.gz --UMI data/T-T1_S3_L002_R1_001.fastq.gz
[Wed Nov 17 23:39:31 2021] Start to extract candidate reads from read files.
Unknown parameter (null)
system TRUST4/fastq-extractor -t 1 -f refs/hg38_bcrtcr.fa -o ./run1_toassemble  --barcodeStart 0 --barcodeEnd 15 --umiStart 16 --umiEnd 25 --barcodeWhitelist refs/737K-august-2016.txt -u data/B-A1_S6_L001_R2_001.fastq.gz -u data/B-A1_S6_L002_R2_001.fastq.gz -u data/B-B1_S4_L001_R2_001.fastq.gz -u data/B-B1_S4_L002_R2_001.fastq.gz -u data/B-M1_S2_L001_R2_001.fastq.gz -u data/B-M1_S2_L002_R2_001.fastq.gz -u data/T-A1_S5_L001_R2_001.fastq.gz -u data/T-A1_S5_L002_R2_001.fastq.gz -u data/T-M1_S1_L001_R2_001.fastq.gz -u data/T-M1_S1_L002_R2_001.fastq.gz -u data/T-T1_S3_L001_R2_001.fastq.gz -u data/T-T1_S3_L002_R2_001.fastq.gz --barcode data/B-A1_S6_L001_R1_001.fastq.gz --barcode data/B-A1_S6_L002_R1_001.fastq.gz --barcode data/B-B1_S4_L001_R1_001.fastq.gz --barcode data/B-B1_S4_L002_R1_001.fastq.gz --barcode data/B-M1_S2_L001_R1_001.fastq.gz --barcode data/B-M1_S2_L002_R1_001.fastq.gz --barcode data/T-A1_S5_L001_R1_001.fastq.gz --barcode data/T-A1_S5_L002_R1_001.fastq.gz --barcode data/T-M1_S1_L001_R1_001.fastq.gz --barcode data/T-M1_S1_L002_R1_001.fastq.gz --barcode data/T-T1_S3_L001_R1_001.fastq.gz --barcode data/T-T1_S3_L002_R1_001.fastq.gz --UMI data/B-A1_S6_L001_R1_001.fastq.gz --UMI data/B-A1_S6_L002_R1_001.fastq.gz --UMI data/B-B1_S4_L001_R1_001.fastq.gz --UMI data/B-B1_S4_L002_R1_001.fastq.gz --UMI data/B-M1_S2_L001_R1_001.fastq.gz --UMI data/B-M1_S2_L002_R1_001.fastq.gz --UMI data/T-A1_S5_L001_R1_001.fastq.gz --UMI data/T-A1_S5_L002_R1_001.fastq.gz --UMI data/T-M1_S1_L001_R1_001.fastq.gz --UMI data/T-M1_S1_L002_R1_001.fastq.gz --UMI data/T-T1_S3_L001_R1_001.fastq.gz --UMI data/T-T1_S3_L002_R1_001.fastq.gz failed: 256 at ./TRUST4/run-trust4 line 48.

I compiled from source, i.e. git pull & make.

mourisl commented 3 years ago

This is very strange: the "Unknown parameter" error message should appear before the "Start to extract candidate reads..." output. The command looks fine to me. Could you please run TRUST4 with just one lane to check whether the error message is the same? Thank you.

jakevc commented 3 years ago

Interesting, yeah, it appears to be the same error with just one lane:

$ ./TRUST4/run-trust4 --od ./foxlab-trust4 -o run1 -f refs/hg38_bcrtcr.fa --ref refs/human_IMGT+C.fa -u data/B-A1_S6_L001_R2_001.fastq.gz --barcode data/B-A1_S6_L001_R1_001.fastq.gz --barcodeRange 0 15 + --UMI data/B-A1_S6_L001_R1_001.fastq.gz --umiRange 16 25 + --repseq --barcodeWhitelist refs/737K-august-2016.txt
[Thu Nov 18 05:01:50 2021] TRUST4 begins.
[Thu Nov 18 05:01:50 2021] SYSTEM CALL: /home/jake.vancampenprovidence.org/yoshi-trust4/TRUST4/fastq-extractor -t 1 -f refs/hg38_bcrtcr.fa -o ./foxlab-trust4/run1_toassemble  --barcodeStart 0 --barcodeEnd 15 --umiStart 16 --umiEnd 25 --barcodeWhitelist refs/737K-august-2016.txt -u data/B-A1_S6_L001_R2_001.fastq.gz --barcode data/B-A1_S6_L001_R1_001.fastq.gz --UMI data/B-A1_S6_L001_R1_001.fastq.gz
[Thu Nov 18 05:01:50 2021] Start to extract candidate reads from read files.
Unknown parameter (null)
system /home/jake.vancampenprovidence.org/yoshi-trust4/TRUST4/fastq-extractor -t 1 -f refs/hg38_bcrtcr.fa -o ./foxlab-trust4/run1_toassemble  --barcodeStart 0 --barcodeEnd 15 --umiStart 16 --umiEnd 25 --barcodeWhitelist refs/737K-august-2016.txt -u data/B-A1_S6_L001_R2_001.fastq.gz --barcode data/B-A1_S6_L001_R1_001.fastq.gz --UMI data/B-A1_S6_L001_R1_001.fastq.gz failed: 256 at ./TRUST4/run-trust4 line 48.
mourisl commented 3 years ago

Can you run it as bulk data, i.e.: ./TRUST4/run-trust4 --od ./foxlab-trust4 -o run1 -f refs/hg38_bcrtcr.fa --ref refs/human_IMGT+C.fa -u data/B-A1_S6_L001_R2_001.fastq.gz --repseq

Just want to confirm, can you use git clone and make to compile TRUST4 in another directory for a clean compilation?

jakevc commented 3 years ago

Yup that works

$ trust4-clean/run-trust4 -t 8 --od ./foxlab-trust4 -o run1 -f refs/hg38_bcrtcr.fa --ref refs/human_IMGT+C.fa -u data/B-A1_S6_L001_R2_001.fastq.gz --repseq
[Thu Nov 18 18:18:42 2021] TRUST4 begins.
[Thu Nov 18 18:18:42 2021] SYSTEM CALL: /yoshi-trust4/trust4-clean/fastq-extractor -t 8 -f refs/hg38_bcrtcr.fa -o ./foxlab-trust4/run1_toassemble  -u data/B-A1_S6_L001_R2_001.fastq.gz
[Thu Nov 18 18:18:42 2021] Start to extract candidate reads from read files.
[Thu Nov 18 18:19:27 2021] Finish extracting reads.
[Thu Nov 18 18:19:27 2021] SYSTEM CALL: /yoshi-trust4/trust4-clean/trust4  -t 8 -f refs/hg38_bcrtcr.fa --trimLevel 2 --skipMateExtension -o ./foxlab-trust4/run1 -u ./foxlab-trust4/run1_toassemble.fq
[Thu Nov 18 18:19:27 2021] Found 2346 reads.
[Thu Nov 18 18:19:27 2021] Finish sorting the reads.
[Thu Nov 18 18:19:27 2021] Finish rough annotations.
[Thu Nov 18 18:19:27 2021] Assembled 41 reads.
[Thu Nov 18 18:19:27 2021] Try to rescue 17 reads for assembly.
[Thu Nov 18 18:19:27 2021] Rescued 0 reads.
[Thu Nov 18 18:19:27 2021] SYSTEM CALL: /yoshi-trust4/trust4-clean/annotator -f refs/human_IMGT+C.fa -a ./foxlab-trust4/run1_final.out -t 8 -o ./foxlab-trust4/run1  -r ./foxlab-trust4/run1_assembled_reads.fa > ./foxlab-trust4/run1_annot.fa
[Thu Nov 18 18:19:27 2021] Start to annotate assemblies.
[Thu Nov 18 18:19:27 2021] Start to realign reads for CDR3 analysis.
[Thu Nov 18 18:19:27 2021] Compute CDR3 abundance.
[Thu Nov 18 18:19:27 2021] Finish annotation.
[Thu Nov 18 18:19:27 2021] SYSTEM CALL: perl /yoshi-trust4/trust4-clean/trust-simplerep.pl ./foxlab-trust4/run1_cdr3.out  > ./foxlab-trust4/run1_report.tsv
[Thu Nov 18 18:19:27 2021] SYSTEM CALL: perl /yoshi-trust4/trust4-clean/trust-airr.pl ./foxlab-trust4/run1_cdr3.out ./foxlab-trust4/run1_annot.fa > ./foxlab-trust4/run1_airr.tsv
[Thu Nov 18 18:19:28 2021] TRUST4 finishes.
mourisl commented 3 years ago

How about running the full data (with barcode, UMI) using trust4-clean? Thank you.

jakevc commented 3 years ago

same same

$ trust4-clean/run-trust4 --od ./foxlab-trust4 -o run1 -f refs/hg38_bcrtcr.fa --ref refs/human_IMGT+C.fa -u data/B-A1_S6_L001_R2_001.fastq.gz --barcode data/B-A1_S6_L001_R1_001.fastq.gz --barcodeRange 0 15 + --UMI data/B-A1_S6_L001_R1_001.fastq.gz --umiRange 16 25 + --repseq --barcodeWhitelist refs/737K-august-2016.txt
[Thu Nov 18 18:25:46 2021] TRUST4 begins.
[Thu Nov 18 18:25:46 2021] SYSTEM CALL: /home/jake.vancampenprovidence.org/yoshi-trust4/trust4-clean/fastq-extractor -t 1 -f refs/hg38_bcrtcr.fa -o ./foxlab-trust4/run1_toassemble  --barcodeStart 0 --barcodeEnd 15 --umiStart 16 --umiEnd 25 --barcodeWhitelist refs/737K-august-2016.txt -u data/B-A1_S6_L001_R2_001.fastq.gz --barcode data/B-A1_S6_L001_R1_001.fastq.gz --UMI data/B-A1_S6_L001_R1_001.fastq.gz
[Thu Nov 18 18:25:46 2021] Start to extract candidate reads from read files.
Unknown parameter (null)
system /home/jake.vancampenprovidence.org/yoshi-trust4/trust4-clean/fastq-extractor -t 1 -f refs/hg38_bcrtcr.fa -o ./foxlab-trust4/run1_toassemble  --barcodeStart 0 --barcodeEnd 15 --umiStart 16 --umiEnd 25 --barcodeWhitelist refs/737K-august-2016.txt -u data/B-A1_S6_L001_R2_001.fastq.gz --barcode data/B-A1_S6_L001_R1_001.fastq.gz --UMI data/B-A1_S6_L001_R1_001.fastq.gz failed: 256 at trust4-clean/run-trust4 line 48.
jakevc commented 3 years ago

It's also worth noting that I got a segfault with the conda install, but I can file that as a separate issue

mourisl commented 3 years ago

I tried several different ways but still could not reproduce this error... what system did you use, and what is your gcc version?

Could you please try running it with only the barcode (without the whitelist and UMI), and then try other combinations such as barcode + whitelist? Sorry for the trouble.
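
For anyone repeating this bisection, the combinations suggested above can be generated with a small dry-run helper. This is only a sketch: the flags and file paths are copied from the commands earlier in this thread, while the function names, the environment variables, and the `run_<label>` output prefixes are made up for illustration.

```shell
# Sketch only: print one run-trust4 command per barcode/whitelist/UMI combination.
# Paths default to the single-lane files used in this thread; override via env vars.
TRUST4_BIN="${TRUST4_BIN:-trust4-clean/run-trust4}"
R1="${R1:-data/B-A1_S6_L001_R1_001.fastq.gz}"
R2="${R2:-data/B-A1_S6_L001_R2_001.fastq.gz}"
WHITELIST="${WHITELIST:-refs/737K-august-2016.txt}"

# $1 is a label for the output prefix (hypothetical naming); the rest are extra options.
trust4_cmd() {
  local label="$1"; shift
  echo "$TRUST4_BIN --od ./foxlab-trust4 -o run_${label} -f refs/hg38_bcrtcr.fa --ref refs/human_IMGT+C.fa -u $R2 --repseq $*"
}

# Emit the commands from the simplest case up to the full failing command.
trust4_combinations() {
  local bc="--barcode $R1 --barcodeRange 0 15 +"
  local umi="--UMI $R1 --umiRange 16 25 +"
  local wl="--barcodeWhitelist $WHITELIST"
  trust4_cmd bc "$bc"
  trust4_cmd bc_wl "$bc $wl"
  trust4_cmd bc_umi "$bc $umi"
  trust4_cmd bc_umi_wl "$bc $umi $wl"
}

# Run the commands in order and stop at the first one that fails.
trust4_run_all() {
  trust4_combinations | while IFS= read -r cmd; do
    echo ">>> $cmd"
    sh -c "$cmd" </dev/null || { echo "first failing combination: $cmd" >&2; break; }
  done
}
```

Calling `trust4_combinations` only prints the four commands for inspection; `trust4_run_all` executes them, so the first failing combination points at the option that triggers the error.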

jakevc commented 3 years ago
$ cat /etc/os-release
NAME="Ubuntu"
VERSION="21.04 (Hirsute Hippo)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 21.04"
VERSION_ID="21.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=hirsute
UBUNTU_CODENAME=hirsute
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/10/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa:hsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 10.3.0-1ubuntu1' --with-bugurl=file:///usr/share/doc/gcc-10/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-10 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-10-gDeRY6/gcc-10-10.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-gDeRY6/gcc-10-10.3.0/debian/tmp-gcn/usr,hsa --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-build-config=bootstrap-lto-lean --enable-link-mutex
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.3.0 (Ubuntu 10.3.0-1ubuntu1)
jakevc commented 3 years ago

Ahh, there we are. Okay, so without --barcodeWhitelist we are cookin with heat.

trust4-clean/run-trust4 -t 8 --od ./foxlab-trust4 -o run1 -f refs/hg38_bcrtcr.fa --ref refs/human_IMGT+C.fa -u data/*_R2_*.fastq.gz --barcode data/*_R1_*.fastq.gz --barcodeRange 0 15 + --UMI data/*_R1_*.fastq.gz --umiRange 16 25 + --repseq

That command is running.

jakevc commented 2 years ago

Just kidding, that eventually failed too, with:

system /trust4-clean/trust4  -f refs/hg38_bcrtcr.fa --trimLevel 2 --skipMateExtension -t 16 -o ./foxlab-trust4/run1 -u ./foxlab-trust4/run1_toassemble.fq --barcode ./foxlab-trust4/run1_toassemble_bc.fa --UMI ./foxlab-trust4/run1_toassemble_umi.fa failed: 9 at trust4-clean/run-trust4 line 48.
mourisl commented 2 years ago

I think this is related to #74. I just found an Ubuntu machine; TRUST4 did not crash there, but it did show some memory access errors. I think this could be due to some compiler optimization. Could you please change the second row of the Makefile from "-O3" to "-O", run "make clean; make", and then test it again? Thank you.
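
The Makefile change described above can also be scripted. A minimal sketch, assuming the optimization flag really is on line 2 of the Makefile as stated; the function name is hypothetical, and the script refuses to edit if line 2 does not contain -O3.

```shell
# Sketch only: lower the optimization level on line 2 of a Makefile from -O3 to -O.
# A backup copy is kept next to the file with a .bak suffix.
lower_opt_level() {
  local makefile="${1:-Makefile}"
  # Bail out rather than guess if the flag is not where we expect it.
  if ! sed -n '2p' "$makefile" | grep -q -- '-O3'; then
    echo "line 2 of $makefile has no -O3; edit it by hand" >&2
    return 1
  fi
  sed -i.bak '2s/-O3/-O/' "$makefile"
}
```

After running `lower_opt_level Makefile` in the TRUST4 source directory, rebuild with `make clean; make` as requested.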

If it is indeed the optimization issue, it could take some time to fix.

jakevc commented 2 years ago

I did recompile with -O, then switched to a machine with more memory and the run finished. After observing the memory usage during runtime, it appears the machine didn't have enough memory to handle all my fastqs. Anyhow, if there was some way to return an informative error when the program runs out of memory that would be useful for future users. Thanks!
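
Until there is a clearer message, the number after "failed:" can be decoded by hand. The sketch below assumes that number is Perl's raw wait status from `system` (exit code in the high byte, terminating signal in the low 7 bits); that assumption about run-trust4 is not confirmed in this thread, and the function name is made up.

```shell
# Sketch only: interpret a raw wait status such as the one printed after "failed:".
# Assumes Perl's $? convention: signal = status & 127, exit code = status >> 8.
decode_wait_status() {
  local status="$1"
  local sig=$(( status & 127 ))
  local code=$(( status >> 8 ))
  if [ "$sig" -ne 0 ]; then
    echo "killed by signal $sig"
  else
    echo "exited with code $code"
  fi
}
```

Under that assumption, `decode_wait_status 256` reports an ordinary exit with code 1 (the "Unknown parameter" failures above), while `decode_wait_status 9` reports a kill by signal 9 (SIGKILL), which is what the Linux out-of-memory killer sends and is consistent with the memory shortage described here.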

snowGoose-Chen commented 2 years ago

I had the same "Unknown parameter (null)" error before the output of "Start to extract candidate reads..." when using TRUST4 installed from conda. But when I used TRUST4 cloned and compiled from GitHub, the problem was magically solved. I ran the same command in both cases. I don't understand what happened.

mourisl commented 2 years ago

The TRUST4 conda package is still the previous version. Let me create a new TRUST4 version, and conda will have the update in a few days.