gersteinlab / texp

TeXP is a pipeline to gauge the autonomous transcription level of L1 subfamilies using short read RNA-seq data
Apache License 2.0

Error: Segmentation fault #3

Closed WeichenZhou closed 4 years ago

WeichenZhou commented 5 years ago

Hi there,

I was running TeXP on the example from the manual. I installed all the dependencies with Conda on our computing cluster. After I submitted the job, it failed with the following errors:

CommandNotFoundError: Your shell has not been properly configured to use 'conda activate'.
To initialize your shell, run

    $ conda init <SHELL_NAME>

Currently supported shells are:
  - bash
  - fish
  - tcsh
  - xonsh
  - zsh
  - powershell

See 'conda init --help' for more information and options.

IMPORTANT: You may need to close and restart your shell after running 'conda init'.

gzip: stdout: Broken pipe

gzip: stdout: Broken pipe

gzip: stdout: Broken pipe
/home/arthurz/anaconda3/envs/texp2/bin/intersectBed: line 2: 18701 Segmentation fault      (core dumped) ${0%/*}/bedtools intersect "$@"
make: *** [/home/arthurz/arthur_remflux_scratch/19.06.28.RNAseq/19.06.28.texp/process/example/quick_texp_run/quick_texp_run.re.filtered.bed] Error 139

In short, intersectBed (bedtools intersect) crashes with a segmentation fault. The quick_texp_run.log file reads:

2019-07-01(15:22:49) TeXP: Created results dir: /home/arthurz/arthur_remflux_scratch/19.06.28.RNAseq/19.06.28.texp/process/example/quick_texp_run

======================

2019-07-01(15:22:50) TeXP: Guessing read legth based on fastq sequences:

2019-07-01(15:22:50) TeXP: Finished guessing read legth based on fastq sequences:

======================

2019-07-01(15:22:50) TeXP: Guessing encoding of fastq read-qualities:

2019-07-01(15:22:50) TeXP: gunzip -c /home/arthurz/app/texp/file.fastq.gz  | head -n 400000 | awk '{if(NR%4==0) printf(%s,/bin/sh);}' | od -A n -t u1 | awk 'BEGIN{min=100;max=0;}{for(i=1;i<=NF;i++) {if(>max) max=; if(<min) min=;}}END{if(max<=74 && min<59) print 33; else if(max>73 && min>=64) print 64; else if(min>=59 && min<64 && max>73) print 64; else print 64;}' > /home/arthurz/arthur_remflux_scratch/19.06.28.RNAseq/19.06.28.texp/process/example/quick_texp_run/quick_texp_run.qualityEncoding

2019-07-01(15:22:53) TeXP: Finished guessing encoding of fastq read-qualities:

======================

2019-07-01(15:22:53) TeXP: Filtering reads by base quality:

2019-07-01(15:22:53) TeXP: gunzip -c /home/arthurz/app/texp/file.fastq.gz  | awk '{line+=1; if ( (line+2) % 4 == 0) { gsub(/\./,N); print /bin/sh} else {print}}' | /home/arthurz/anaconda3/envs/texp2/bin/fastq_quality_filter -v -Q64 -p 80 -q 20 > /home/arthurz/arthur_remflux_scratch/19.06.28.RNAseq/19.06.28.texp/process/example/quick_texp_run/quick_texp_run.filtered.fastq

Quality cut-off: 20
Minimum percentage: 80
Input: 113588758 reads.
Output: 90010486 reads.
discarded 23578272 (20%) low-quality reads.
2019-07-01(15:54:24) TeXP: Finished filtering reads by base quality

======================

2019-07-01(15:54:24) TeXP: Mapping reads to a reference genome:

2019-07-01(15:54:24) TeXP: /home/arthurz/anaconda3/envs/texp2/bin/bowtie2 -p 1 --sensitive-local -N1 --no-unal -x /home/arthurz/app/data/library/bowtie2/bowtie2/hg38 -U /home/arthurz/arthur_remflux_scratch/19.06.28.RNAseq/19.06.28.texp/process/example/quick_texp_run/quick_texp_run.filtered.fastq 2>> /home/arthurz/arthur_remflux_scratch/19.06.28.RNAseq/19.06.28.texp/process/example/quick_texp_run.log | /home/arthurz/anaconda3/envs/texp2/bin/samtools view -Sb - 2>> /home/arthurz/arthur_remflux_scratch/19.06.28.RNAseq/19.06.28.texp/process/example/quick_texp_run.log > /home/arthurz/arthur_remflux_scratch/19.06.28.RNAseq/19.06.28.texp/process/example/quick_texp_run/quick_texp_run.bam; /home/arthurz/anaconda3/envs/texp2/bin/samtools sort -@1 /home/arthurz/arthur_remflux_scratch/19.06.28.RNAseq/19.06.28.texp/process/example/quick_texp_run/quick_texp_run.bam -o /home/arthurz/arthur_remflux_scratch/19.06.28.RNAseq/19.06.28.texp/process/example/quick_texp_run/quick_texp_run.sorted.bam; rm -R /home/arthurz/arthur_remflux_scratch/19.06.28.RNAseq/19.06.28.texp/process/example/quick_texp_run/quick_texp_run.bam; 
(ERR): "/home/arthurz/app/data/library/bowtie2/bowtie2/hg38" does not exist or is not a Bowtie 2 index
Exiting now ...
2019-07-01(15:54:24) TeXP: Indexing bam file:

2019-07-01(15:54:24) TeXP: Counting total number of mapped reads:

======================

2019-07-01(15:54:25) TeXP: Intersecting reads with repeat masked regions:

Do you have any insight into this issue? I would appreciate any suggestions.

Arthur

fabiocpn commented 5 years ago

Dear Arthur,

I noticed the following lines in your log file:

2019-07-01(15:54:24) TeXP: [...] (ERR): "/home/arthurz/app/data/library/bowtie2/bowtie2/hg38" does not exist or is not a Bowtie 2 index Exiting now ...

I think bedtools might be segfaulting because the BAM file is empty or contains only a header: bowtie2 exited without finding its index, so no reads were mapped. Could you try fixing the path to the Bowtie 2 index and see if it works?

Also, I could add some extra conditions and error messages if that's really the case.
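A minimal sketch of what such a pre-flight check could look like, assuming the standard Bowtie 2 index layout: six files named `<prefix>.1.bt2`, `.2.bt2`, `.3.bt2`, `.4.bt2`, `.rev.1.bt2` and `.rev.2.bt2`, or the same names ending in `.bt2l` for large indexes. The function name `bowtie2_index_exists` is hypothetical and is not part of TeXP.

```shell
# Hypothetical helper (not part of TeXP): succeed only if every file of a
# Bowtie 2 index exists and is non-empty for the given prefix.
# Accepts either a small (.bt2) or a large (.bt2l) index.
bowtie2_index_exists() (
    prefix=$1
    for ext in bt2 bt2l; do
        missing=0
        for part in 1 2 3 4 rev.1 rev.2; do
            # -s is true only for files that exist and are non-empty
            [ -s "${prefix}.${part}.${ext}" ] || { missing=1; break; }
        done
        [ "$missing" -eq 0 ] && exit 0
    done
    exit 1
)

# Example usage, with the index prefix taken from the log above:
# bowtie2_index_exists /home/arthurz/app/data/library/bowtie2/bowtie2/hg38 \
#     || { echo "Bowtie 2 index not found; check the index path" >&2; exit 1; }
```

Running a check like this before the mapping step would stop the pipeline with a clear message, rather than letting later steps run on a BAM file that has no alignments.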

fabiocpn commented 4 years ago

Closing this for now. I'm happy to reopen the issue if we hear back.

WeichenZhou commented 4 years ago

Thanks! I corrected EXT_LIBRARY_PATH in the opts.mk file, which I had previously set incorrectly. Appreciate it!