novoalab / EpiNano

Detection of RNA modifications from Oxford Nanopore direct RNA sequencing reads (Liu*, Begik* et al., Nature Comm 2019)
GNU General Public License v2.0

Epinano_Current.sh script not running #64

Closed aman21392 closed 3 years ago

aman21392 commented 4 years ago

sh /home/aclab/apps/EpiNano/Epinano_Current.sh -b E6.bam -r /Drive8/E6/fastq_pass/combined_T.fastq -f /Drive4/Homo_cdna.fa -d /Drive8/E6/fast5_pass/ -t 70 -m t

/home/aclab/apps/EpiNano/Epinano_Current.sh: 21: /home/aclab/apps/EpiNano/Epinano_Current.sh: [[: not found
/home/aclab/apps/EpiNano/Epinano_Current.sh: 49: /home/aclab/apps/EpiNano/Epinano_Current.sh: Bad substitution

Can you please tell me why it is not running? Thanks in advance.

Huanle commented 4 years ago

Hi @aman21392 ,

can you try bash /home/aclab/apps/EpiNano/Epinano_Current.sh

or

chmod +x /home/aclab/apps/EpiNano/Epinano_Current.sh
/home/aclab/apps/EpiNano/Epinano_Current.sh

Let me know if either of the above works for you.
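For context on the two messages in the first post: `[[: not found` and `Bad substitution` are what a POSIX shell such as dash typically prints when it meets bash-only syntax, which is why invoking the script with `bash` instead of `sh` can help. Below is a minimal Python sketch of a pre-flight check. The helper names `looks_bash_only` and `suggested_interpreter`, and the two patterns searched for, are illustrative assumptions and not part of EpiNano.

```python
import re

# Illustrative patterns only (assumption): two common bash-only constructs.
BASH_ONLY_PATTERNS = [
    re.compile(r"(^|\s)\[\[\s"),                    # [[ ... ]] conditional
    re.compile(r"\$\{[A-Za-z_]\w*(\^\^|,,|:\d)"),   # ${var^^}, ${var,,}, ${var:0:3}
]


def looks_bash_only(script_text):
    """Return True if the text has a bash shebang or matches a bash-only pattern."""
    lines = script_text.splitlines()
    if lines and lines[0].startswith("#!") and "bash" in lines[0]:
        return True
    return any(p.search(script_text) for p in BASH_ONLY_PATTERNS)


def suggested_interpreter(script_text):
    """Suggest 'bash' when bash-only syntax is detected, otherwise 'sh'."""
    return "bash" if looks_bash_only(script_text) else "sh"
```

Passing the contents of a shell script to `suggested_interpreter` gives a rough hint only; the pattern list is deliberately short and will miss other bash extensions.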

aman21392 commented 4 years ago

bash /home/aclab/apps/EpiNano/Epinano_Current.sh -b E6.bam -r /Drive8/E6_T.fastq -f /Drive4/Homo_cdna.fa -d /Drive8/E6/fast5_pass/ -t 70 -m t

It works for me, but now it gives other errors:

[readdb] indexing /Drive8/E6/fast5_pass/
[readdb] num reads: 380492, num reads with path to fast5: 380492
[post-run summary] total reads: 35965, unparseable: 0, qc fail: 434, could not calibrate: 752, no alignment: 217, bad fast5: 0
  File "/home/aclab/apps/EpiNano/misc/Epinano_Current.py", line 53
    smallfile = f"{tmp_dir}/{idx}.chunk"
    ^
SyntaxError: invalid syntax
  File "/home/aclab/apps/EpiNano/misc/concat_events.py", line 17
    print (header, file = outfh)
    ^
SyntaxError: invalid syntax
  File "/home/aclab/apps/EpiNano/misc/Slide_Intensity.py", line 59
    print (kmer+','+l, file=outfh)
    ^
SyntaxError: invalid syntax

Please suggest how to sort out this problem. Thanks in advance.
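A note on the three `SyntaxError` messages: f-strings such as the one on line 53 of Epinano_Current.py require Python 3.6 or newer, and `print(..., file=...)` is a syntax error under Python 2 unless `print_function` is imported. One plausible cause, which the thread does not confirm, is that the `python` command on this machine resolves to an older interpreter. A minimal sketch of a version check follows; the function name `interpreter_ok` is hypothetical.

```python
import sys

MIN_VERSION = (3, 6)  # f-strings were introduced in Python 3.6


def interpreter_ok(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


# Report which interpreter is running and whether it is new enough.
print(sys.executable, sys.version_info[:3], "ok" if interpreter_ok() else "too old")
```

The check itself avoids f-strings, so it also runs on an old interpreter and can be used to compare `python` with `python3`.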

Huanle commented 4 years ago

Hi @aman21392 , What output have you got so far?

aman21392 commented 4 years ago

So far I have the following output:

E6_.bam
E6.eventalign.tsv.gz
E6.per.read.var.csv
E6_wildtype.tsv
E6.bam.bai
E6.per.read.var.5mer.csv
E6_strand.per.site.var.csv

Thanks in advance

Huanle commented 4 years ago

Hi @aman21392 ,

Thanks. I would like to know what Epinano_Current.sh has produced. Can you check and let me know?

aman21392 commented 4 years ago

bash /home/aclab/apps/EpiNano/Epinano_Current.sh -b E6.bam -r /Drive8/fastq_pass/E6_T.fastq -f /Drive4/Homo_cdna.fa -d /Drive8/E6/fast5_pass/ -t 70 -m t

Error running faidx_build on /Drive8/E6/fastq_pass/E6_T.fastq.index
[W::bgzf_read_block] EOF marker is absent. The input is probably truncated.
nanopolish: bgzf.c:1883: bgzf_useek: Assertion `fp->block_offset <= fp->block_length' failed.
  File "/home/aclab/apps/EpiNano/misc/Epinano_Current.py", line 53
    smallfile = f"{tmp_dir}/{idx}.chunk"
    ^
SyntaxError: invalid syntax
  File "/home/aclab/apps/EpiNano/misc/concat_events.py", line 17
    print (header, file = outfh)
    ^
SyntaxError: invalid syntax
  File "/home/aclab/apps/EpiNano/misc/Slide_Intensity.py", line 59
    print (kmer+','+l, file=outfh)
    ^
SyntaxError: invalid syntax

There is no output from this. Thanks

Huanle commented 4 years ago

Hi @aman21392 ,

This is associated with nanopolish. Can you check this issue and see if you are in the same situation? https://github.com/jts/nanopolish/issues/271
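As background on the htslib warning above: "EOF marker is absent" means a BGZF-compressed file did not end with the standard 28-byte empty end-of-file block, which usually points to a file that was only partly written. The sketch below checks a file for that block. The thread does not say which file nanopolish was reading, so the path to test is left to the reader, and the function name `has_bgzf_eof` is hypothetical.

```python
import os

# The 28-byte empty BGZF block that marks end-of-file (from the SAM/BAM specification).
BGZF_EOF = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 02 00 "
    "1b 00 03 00 00 00 00 00 00 00 00 00"
)


def has_bgzf_eof(path):
    """Return True if the file at `path` ends with the BGZF end-of-file block."""
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as fh:
        fh.seek(-len(BGZF_EOF), os.SEEK_END)  # jump to the last 28 bytes
        return fh.read() == BGZF_EOF
```

A `False` result for a file that is supposed to be BGZF-compressed suggests it should be regenerated rather than reused.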

aman21392 commented 4 years ago

Hi, now everything is fine with nanopolish. These are all the files produced so far:

E6_wildtype.bam
E6_wildtype.eventalign.tsv.gz
E6_wildtype.per.read.var.csv
E6_wildtype.tsv
E6_wildtype.bam.bai
E6_wildtype.per.read.var.5mer.csv
E6_wildtype.plus_strand.per.site.var.csv

[readdb] indexing /Drive8/E6/fast5_pass/
[readdb] num reads: 380492, num reads with path to fast5: 380492
[post-run summary] total reads: 35992, unparseable: 0, qc fail: 434, could not calibrate: 752, no alignment: 217, bad fast5: 0
  File "/home/aclab/apps/EpiNano/misc/Epinano_Current.py", line 53
    smallfile = f"{tmp_dir}/{idx}.chunk"
    ^
SyntaxError: invalid syntax
  File "/home/aclab/apps/EpiNano/misc/concat_events.py", line 17
    print (header, file = outfh)
    ^
SyntaxError: invalid syntax
  File "/home/aclab/apps/EpiNano/misc/Slide_Intensity.py", line 59
    print (kmer+','+l, file=outfh)
    ^
SyntaxError: invalid syntax

But now I think there is another problem in your script. Can you please sort out this problem? Thanks in advance.

Huanle commented 4 years ago

can you try the following and then let me know the outputs?

samtools view -F3860 E6.bam | cut -f1 > E6.forward.reads
python /path/to/misc/eventalign_strandedness.py E6.forward.reads E6_wildtype.eventalign.tsv

aman21392 commented 4 years ago

Hi Huanle, here is the output of the commands you gave me:

1.4G Oct 26 09:27 E6_wildtype.eventalign.tsv.gz.forward_strand.gz
6.6M Oct 26 09:27 E6_wildtype.eventalign.tsv.gz.reverse_strand.gz
157K Oct 26 09:18 E6.forward.reads
1.4G Oct 23 09:47 E6_wildtype.eventalign.tsv.gz
345M Oct 22 20:07 E6_wildtype.per.read.var.5mer.csv
176M Oct 22 20:06 E6_wildtype.per.read.var.csv
80M Oct 22 20:06 E6_wildtype.plus_strand.per.site.var.csv
178M Oct 22 19:19 E6_wildtype.tsv
2.3M Oct 22 19:17 E6_wildtype.bam.bai
322M Oct 22 19:16 E6_wildtype.bam

Thanks in advance

Huanle commented 4 years ago

The commands are from the shell script. So far, it goes well. To find out exactly what is going wrong, we can only proceed with the commands that come after the ones above. Now see what you get with the commands below:

python /path/to/misc/Epinano_Current.py --infile E6_wildtype.forward_strand.gz --reference /Drive4/Homo_cdna.fa --outdir E6_wildtype.forward  --threads 6 --strand + 
python /path/to/misc/Epinano_Current.py --infile E6_wildtype.reverse_strand.gz --reference /Drive4/Homo_cdna.fa --outdir E6_wildtype.reverse  --threads 6 --strand -

python /path/to/misc/concat_events.py E6_wildtype.forward_events.collapsed
python /path/to/misc/concat_events.py E6_wildtype.reverse_events.collapsed

aman21392 commented 3 years ago

1- nohup python3 /home/aclab/apps/EpiNano/misc/Epinano_Current.py --infile /Drive8/E6_wildtype.eventalign.tsv.gz.forward_strand.gz --reference /Drive4/Homo_cdna.fa --outdir E6_wildtype.forward --threads 70 --strand - &

It has been running for one day; the file sizes have not changed and no error is shown. Does it normally take this long, or has something gone wrong?

total 3.9G
-rw-rw-r-- 1 aclab aclab 1.1G Oct 29 10:33 0.chunk
-rw-rw-r-- 1 aclab aclab 957M Oct 29 10:33 1.chunk
-rw-rw-r-- 1 aclab aclab 968M Oct 29 10:34 2.chunk
-rw-rw-r-- 1 aclab aclab 993M Oct 29 10:34 3.chunk
-rw-rw-r-- 1 aclab aclab 7.6M Oct 29 10:34 4.chunk

As you can see, the files were created a day ago when I ran this command, but it is still running. Is everything fine or not?

2- nohup python3 /home/aclab/apps/EpiNano/misc/Epinano_Current.py --infile /Drive8/E6_wildtype.eventalign.tsv.gz.reverse_strand.gz --reference /Drive4/Homo_cdna.fa --outdir E6_wildtype.reverse --threads 70 --strand - &

total 20M
-rw-rw-r-- 1 aclab aclab 20M Oct 29 10:59 0.chunk

Huanle commented 3 years ago

Hi @aman21392 , So it seems only the - strand file processing is lingering on, right? Can you send me your input files so I can look at them?

btw, for the forward strand file, you should use --strand + .

Thanks.
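To make the file-to-strand pairing explicit, here is a minimal Python sketch that builds the two Epinano_Current.py command lines with the flags used in this thread (`--infile`, `--reference`, `--outdir`, `--threads`, `--strand`). The function name `build_commands`, its parameters, and the assumption that the inputs are named `<eventalign>.forward_strand.gz` and `<eventalign>.reverse_strand.gz` are illustrative; the script path is the placeholder used above.

```python
import shlex

# Forward-strand input goes with "+", reverse-strand input with "-".
STRANDS = [("forward", "+"), ("reverse", "-")]


def build_commands(eventalign, sample, reference,
                   script="/path/to/misc/Epinano_Current.py", threads=6):
    """Return one argument list per strand, each with its matching --strand flag."""
    commands = []
    for name, flag in STRANDS:
        commands.append([
            "python3", script,
            "--infile", "{}.{}_strand.gz".format(eventalign, name),
            "--reference", reference,
            "--outdir", "{}.{}".format(sample, name),
            "--threads", str(threads),
            "--strand", flag,
        ])
    return commands


for cmd in build_commands("E6_wildtype.eventalign.tsv.gz", "E6_wildtype",
                          "/Drive4/Homo_cdna.fa"):
    # Print a shell-safe version of each command for review before running it.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Generating both commands from one table keeps the forward file from being paired with `--strand -` by accident.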

aman21392 commented 3 years ago

No, both the forward and the reverse strand runs are still running. By input files, do you mean E6_wildtype.eventalign.tsv.gz.reverse_strand.gz and E6_wildtype.eventalign.tsv.gz.forward_strand.gz, or something else? Where should I send them? The forward_strand file is about 1 GB and the reverse_strand file is small, around 6 MB.

Regarding "for the forward strand file, you should use --strand +": I wrote the - symbol by mistake. I did use --strand + for the forward strand file. Thanks in advance

aman21392 commented 3 years ago

E6_wildtype.eventalign.tsv.gz.reverse_strand.gz

Can you tell me how long this Epinano_Current.py command takes to complete on average?

python /path/to/misc/Epinano_Current.py --infile E6_wildtype.forward_strand.gz --reference /Drive4/Homo_cdna.fa --outdir E6_wildtype.forward --threads 6 --strand +

Huanle commented 3 years ago

Hi @aman21392 ,

Sorry for the late reply. I did not benchmark its running time. My major concern was RAM consumption: I reduced the RAM cost without paying much attention to speed, as I believed it would not be very slow in the end, especially when using multiple CPUs.

Have you finished running it? 1 GB is really small and should not take long.

aman21392 commented 3 years ago

Sorry, but I have not finished the run yet. I am stuck on this step because every time it runs but does not complete, so I am waiting for your suggestion on how to complete this run.

nohup python3 /path/to/misc/Epinano_Current.py --infile E6_wildtype.forward_strand.gz --reference /Drive4/Homo_cdna.fa --outdir E6_wildtype.forward --threads 6 --strand + &

nohup python3 /home/aclab/apps/EpiNano/misc/Epinano_Current.py --infile /Drive8/E6_wildtype.eventalign.tsv.gz.reverse_strand.gz --reference /Drive4/Homo_cdna.fa --outdir E6_wildtype.reverse --threads 70 --strand - &

It keeps running for days, so I am wondering whether I am doing anything wrong in the command.

Thanks in advance for your suggestion.

Huanle commented 3 years ago

Hi @aman21392 , Can you share some of your data with me, i.e. the current intensity results and the reference file? You can grep 100 reads from your current intensity results. Thanks a lot.

aman21392 commented 3 years ago

Sorry for the late reply. I don't think I can send it here because the file is too big. Can you please tell me where I can send it? Please provide details. Thanks

Huanle commented 3 years ago

Hi @aman21392 , I know it's usually big. So you can grep the details associated with a few reads and send them to me so I can check the format. If that is still beyond GitHub's limit, then please share just a couple of reads.

aman21392 commented 3 years ago

fil.txt

I used a human cDNA.fa file downloaded from Ensembl.

Sorry for the delay. I have sent you some of the fastq reads. Is there anything else you need? Thanks in advance for helping me sort out this issue.

aman21392 commented 3 years ago

fil.txt

I have sent more fastq reads. Maybe you can use these reads. Thanks

Huanle commented 3 years ago

Hi @aman21392 ,

I was requesting event align results associated with a few reads to have a look.
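One way to produce such a small sample is sketched below in Python. It assumes the file is tab-separated, that its first line is a header, and that the read identifier sits in the fourth column (index 3), as in standard nanopolish eventalign output; the function name `subset_eventalign` and its parameters are hypothetical.

```python
import gzip


def subset_eventalign(in_path, out_path, n_reads=100, read_col=3):
    """Copy the header and every row belonging to the first `n_reads` reads.

    Returns the number of distinct reads written.
    """
    opener = gzip.open if in_path.endswith(".gz") else open
    kept = set()
    with opener(in_path, "rt") as src, open(out_path, "w") as dst:
        dst.write(src.readline())  # assumed header line
        for line in src:
            fields = line.rstrip("\n").split("\t")
            if len(fields) <= read_col:
                continue  # skip malformed or short lines
            read_id = fields[read_col]
            if read_id not in kept:
                if len(kept) >= n_reads:
                    continue  # already have enough reads; ignore new ones
                kept.add(read_id)
            dst.write(line)
    return len(kept)
```

For example, `subset_eventalign("E6_wildtype.eventalign.tsv.gz", "sample.tsv", n_reads=100)` would write a file small enough to attach. The whole input is scanned, so rows of a kept read are included even if they are not contiguous.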

aman21392 commented 3 years ago

send.txt send_fastq.txt

There are 2 files: send.txt is the eventalign file, and send_fastq.txt contains the reads whose sequences are present in the eventalign file. Thanks in advance for looking at these files to sort out my issue.