novoalab / EpiNano

Detection of RNA modifications from Oxford Nanopore direct RNA sequencing reads (Liu*, Begik* et al., Nature Comm 2019)
GNU General Public License v2.0

Do you have non-multiprocess version? #89

Closed: Raarbiarsan1899 closed this issue 3 years ago

Raarbiarsan1899 commented 3 years ago

Several posts in this repo report problems with multiprocessing. Do you have a non-multiprocessing version so that we can at least run it? Thanks in advance.

Raarbiarsan1899 commented 3 years ago

I think the problem is with the generators. Here is how I solved the multiprocessing issue in Epinano_Variants.py, at least partially.

Turn each generator into a list with a list comprehension (3 lines):

from:

    tsv_gen.append (stdin_stdout_gen (cmd.stdout))
    tsv_gen.append (stdin_stdout_gen(run1.stdout))
    tsv_gen.append (stdin_stdout_gen(run2.stdout))

to:

    tsv_gen.append ([_line for _line in stdin_stdout_gen (cmd.stdout)])
    tsv_gen.append ([_line for _line in stdin_stdout_gen(run1.stdout)])
    tsv_gen.append ([_line for _line in stdin_stdout_gen(run2.stdout)])
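The thread does not say exactly how the generators failed, but two general properties of Python generators make the list version easier to use with multiprocessing: a generator cannot be pickled (so it cannot be passed to a worker process), and it can only be iterated once. A minimal sketch of both, where `line_gen` is a hypothetical stand-in for `stdin_stdout_gen` and the sample lines are made up:

```python
import pickle


def line_gen(lines):
    """Hypothetical stand-in for stdin_stdout_gen: yields one line at a time."""
    for line in lines:
        yield line


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


# Made-up sample lines, not real sam2tsv output.
raw = ["#READ_NAME\tPOS\n", "read1\t0\n", "read2\t1\n"]

gen = line_gen(raw)
as_list = [_line for _line in line_gen(raw)]

gen_picklable = can_pickle(gen)        # generators cannot be pickled
list_picklable = can_pickle(as_list)   # a list of strings can

first_pass = list(gen)    # consumes the generator
second_pass = list(gen)   # nothing left on the second pass

print(gen_picklable, list_picklable, len(first_pass), len(second_pass))
```

The trade-off is memory: the list holds the whole command output at once, whereas the generator streamed it line by line.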

In split_tsv_for_per_site_var_freq, move the setup code into the try block, and replace the next(tsv) calls with list indexing and slicing:

move:

    head = next(tsv)
    firstline = next (tsv)
    current_rd = firstline.split()[0]
    rd_cnt = 1
    idx = 0
    out_fn = "{}/CHUNK{}.txt".format(folder, idx)
    out_fh = open (out_fn,'w')
    print (firstline.rstrip(), file=out_fh)

into the try block, ahead of the existing "for line in tsv:" loop, so that it reads:

    try:
        head = tsv[0] # next(tsv)
        firstline = tsv[1] # next (tsv)
        current_rd = firstline.split()[0]
        rd_cnt = 1
        idx = 0
        out_fn = "{}/CHUNK{}.txt".format(folder, idx)
        out_fh = open (out_fn,'w')
        print (firstline.rstrip(), file=out_fh)
        for line in tsv[2:]:
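To make the indexing change concrete, here is a minimal in-memory sketch of splitting a list of lines into chunks by read. It is not the EpiNano function: `split_lines_by_read` is a hypothetical name, it returns lists instead of writing CHUNK files, and it assumes that the first whitespace-separated field of each line is the read name, that lines of the same read are contiguous, and that there are no blank lines.

```python
def split_lines_by_read(tsv, num_reads_per_chunk=2):
    """Split a list of TSV lines into chunks of at most num_reads_per_chunk reads.

    tsv[0] is treated as the header and skipped; tsv[1] is the first data line.
    """
    if len(tsv) < 2:
        return []  # header only (or empty input): nothing to split

    firstline = tsv[1]                  # was: next(tsv) after the header
    current_rd = firstline.split()[0]   # read name of the first data line
    rd_cnt = 1
    chunks = [[firstline.rstrip()]]

    for line in tsv[2:]:                # was: for line in tsv
        rd = line.split()[0]
        if rd != current_rd:
            # A new read starts here.
            current_rd = rd
            rd_cnt += 1
            if rd_cnt > num_reads_per_chunk:
                chunks.append([])       # start the next chunk
                rd_cnt = 1
        chunks[-1].append(line.rstrip())
    return chunks


# Made-up sample lines, not real sam2tsv output.
lines = ["#READ_NAME\tPOS", "r1\t0", "r1\t1", "r2\t0", "r3\t0", "r3\t1"]
chunks = split_lines_by_read(lines, num_reads_per_chunk=2)
print(chunks)
```

Indexing with tsv[0], tsv[1] and tsv[2:] only works once the input is a list, which is why this change goes together with the list comprehension above.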

I think there is a typo here, or my sam2tsv version is different (3 lines). The reference option is lowercase -r in the script, but my sam2tsv expects uppercase -R:

from:

    cmd = f"samtools view -h -F 3860 {bam_file} | java -jar {sam2tsv} -r {reference_file} "\

to:

    cmd = f"samtools view -h -F 3860 {bam_file} | java -jar {sam2tsv} -R {reference_file} " \

Huanle commented 3 years ago

Hi @Raarbiarsan1899 ,

What specific problem have you come across when running epinano_variants? Regarding sam2tsv, are you using the sam2tsv.jar included in the repo? I will release a new version without using sam2tsv asap. Thanks.

Raarbiarsan1899 commented 3 years ago

The sam2tsv version I was using outputs an additional column, which caused the deletion frequency to be miscalculated. I switched to the sam2tsv.jar included in the repo and it works now.

The multiprocessing issue was solved with the changes I described above.

Really appreciate your help!