CSB5 / INC-Seq

INC-Seq: Accurate single molecule reads using nanopore sequencing

Error using pbdagcon #4

Closed JaLuxe closed 5 years ago

JaLuxe commented 7 years ago

I am currently testing your software and ran into a small problem:

error_161122_2

I tried to perform the task manually from the terminal, which resulted in the following output: error_161122_3

If I add spaces between the options and their respective values, pbdagcon works just fine, as can be seen in the last picture. error_161122_4

Is there anything I can do to get around this problem?

Best regards, JaLuxe

JaLuxe commented 7 years ago

I changed it in my local "utils/buildConsensus.py" and it works like a charm.
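For readers hitting the same error, here is a minimal sketch of the kind of change described above. The exact local edit is not shown in this thread, so this is an assumption: the function name `build_pbdagcon_cmd` is hypothetical, and the options `-t`, `-c` and `-m` are taken from the command quoted later in this thread. Each option and its value is passed as a separate token.

```python
def build_pbdagcon_cmd(binary, m5_path, trim):
    """Return an argv list with every option and value as its own token.

    Hypothetical helper; the options mirror the command quoted in this thread.
    """
    return [binary, "-t", str(trim), "-c", "1", "-m", "1", m5_path]


# Example: the list can be handed directly to subprocess.Popen.
cmd = build_pbdagcon_cmd("utils/pbdagcon", "reads with space.m5", 0)
print(cmd)
```

Building the list directly, rather than formatting one string and calling `.split()`, also keeps input paths that contain spaces in a single argument.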

edit: As I did not commit my changes, I will leave this issue open until somebody implements a fix in an update.

best regards, JaLuxe

lch14forever commented 7 years ago

Hi JaLuxe,

It seems you are using an older version of INC-Seq and a different version of PBDAGCON than mine. I think the PBDAGCON version on our server is quite old (git commit #de1cf85), and I did not have any problem when omitting the spaces between its options and values.

In a recent fix, I have included the compiled PBDAGCON binary that we use. In addition, I have implemented a function to work around a non-terminating bug in PBDAGCON (I also added spaces between the options and their values).

import os
import subprocess
import sys
import time

def pbdagcon(m5, t):
    # Use the pbdagcon binary that sits next to this script
    script_dir = os.path.dirname(os.path.realpath(__file__))
    cmd = ("%s/pbdagcon -t %d -c 1 -m 1 %s" % (script_dir, t, m5)).split()

    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    ## if pbdagcon does not finish within 5 sec, trim 1 more base and recursively run it
    poll_seconds = 0.25
    deadline = time.time() + 5
    while time.time() < deadline and proc.poll() is None:
        time.sleep(poll_seconds)

    if proc.poll() is None:
        proc.terminate()
        sys.stderr.write("Warning: PBDAGCON timeout! Trimming %d base(s).\n" %(t+1))

    stdout, stderr = proc.communicate()
    # Any non-zero exit status (including termination after a timeout) triggers a retry with t+1
    if proc.returncode != 0:
        stdout = pbdagcon(m5, t+1)
    return stdout

You can try pulling the latest changes; this version uses only the PBDAGCON binary inside the "utils" folder.

Chenhao.

JaLuxe commented 7 years ago

Hi Chenhao,

First of all, thank you for replying so fast.

I have now freshly cloned the repository. Unfortunately, a different error occurs now. The program was again called with the sample input in the data folder. I only pasted the last few lines, as the previous hundreds were identical.

  File "/home/bio/Tools/INC-Seq/INC-Seq/utils/buildConsensus.py", line 220, in pbdagcon
    stdout = pbdagcon(m5, t+1)
  File "/home/bio/Tools/INC-Seq/INC-Seq/utils/buildConsensus.py", line 220, in pbdagcon
    stdout = pbdagcon(m5, t+1)
  File "/home/bio/Tools/INC-Seq/INC-Seq/utils/buildConsensus.py", line 220, in pbdagcon
    stdout = pbdagcon(m5, t+1)
  File "/home/bio/Tools/INC-Seq/INC-Seq/utils/buildConsensus.py", line 207, in pbdagcon
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
  File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1207, in _execute_child
    if isinstance(args, types.StringTypes):
RuntimeError: maximum recursion depth exceeded while calling a Python object

lch14forever commented 7 years ago

What if you replace the PBDAGCON binary with your version of PBDAGCON ($ mv utils/pbdagcon utils/_pbdagcon && cp `which pbdagcon` utils/)?
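The same swap can be sketched in Python, for anyone who prefers to script it. This is only an illustration of the shell command above: the function name `swap_in_system_binary` and its `search_path` parameter are hypothetical and not part of INC-Seq, and the sketch assumes Python 3 on a Unix-like system.

```python
import os
import shutil


def swap_in_system_binary(utils_dir, name="pbdagcon", search_path=None):
    """Back up utils_dir/<name> as utils_dir/_<name>, then copy the PATH version in.

    Hypothetical helper mirroring: mv utils/pbdagcon utils/_pbdagcon && cp `which pbdagcon` utils/
    """
    # shutil.which plays the role of `which`; it returns None if nothing is found.
    system_binary = shutil.which(name, path=search_path)
    if system_binary is None:
        raise FileNotFoundError("%s not found on PATH" % name)

    bundled = os.path.join(utils_dir, name)
    if os.path.exists(bundled):
        # Keep the bundled binary around under a leading underscore.
        shutil.move(bundled, os.path.join(utils_dir, "_" + name))

    # copy2 preserves the permission bits, so the copy stays executable.
    shutil.copy2(system_binary, bundled)
    return system_binary
```

Calling `swap_in_system_binary("utils")` from the repository root would correspond to the command above. Because the old binary is kept as `_pbdagcon`, the change is easy to undo.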

JaLuxe commented 7 years ago

good call!

Thank you a lot. It works :)

lch14forever commented 7 years ago

Could you let me know your PBDAGCON version?

The previous problem arose because the older version could somehow go into an infinite loop; the recursion was meant to work around that by trimming a few bases (although I don't understand how this fixes it fully). It seems that on your machine the recursion itself never terminated and ran until the stack overflowed. I was not expecting the program to behave differently on different machines. I will leave this thread open in case others face the same problem.
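One way to keep a binary that always fails from recursing until the stack overflows is to cap the number of retries. The following is a minimal sketch of that idea, not the fix that was committed to INC-Seq. It assumes Python 3.5+ (for `subprocess.run` and its `timeout` parameter), and the names `run_with_bounded_retries`, `make_cmd` and `max_trim` are hypothetical.

```python
import subprocess
import sys


def run_with_bounded_retries(make_cmd, max_trim=10, timeout=5.0):
    """Run make_cmd(trim) for trim = 0..max_trim and return the first successful stdout.

    make_cmd is a caller-supplied function that maps a trim value to an argv list.
    """
    for trim in range(max_trim + 1):
        try:
            result = subprocess.run(
                make_cmd(trim),
                stdout=subprocess.PIPE,
                stderr=subprocess.PIPE,
                timeout=timeout,  # the child is killed if it runs longer than this
            )
        except subprocess.TimeoutExpired:
            sys.stderr.write("Warning: timeout at trim=%d, retrying.\n" % trim)
            continue
        if result.returncode == 0:
            return result.stdout
    # Give up with a clear error instead of retrying forever.
    raise RuntimeError("command still failing after trim=%d" % max_trim)
```

With a loop instead of recursion, a binary that cannot run at all produces one clear error after `max_trim + 1` attempts rather than a long traceback ending in "maximum recursion depth exceeded".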

JaLuxe commented 7 years ago

I no longer have access to the machine, but I downloaded it today via apt-get, so I think it should be the latest.

edit: formatting

haansi commented 7 years ago

Thanks for providing inc-seq and leaving this issue open :-)

For me, the example call ./inc-seq.py -i data/inc_seq_test_read.fa -o consensus.fa resulted in issues with pbdagcon:

  File "/../../INC-Seq/utils/buildConsensus.py", line 207, in pbdagcon
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
  File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1207, in _execute_child
    if isinstance(args, types.StringTypes):
RuntimeError: maximum recursion depth exceeded while calling a Python object

Replacing the bundled binary with one compiled from https://github.com/PacificBiosciences/pbdagcon (latest commit 87f393f) solved the issue as well.

lch14forever commented 7 years ago

Hi,

I have found a problem with the pbdagcon binary that I uploaded previously. I have now uploaded the latest version of pbdagcon and tested it locally.

Chenhao.

Szymonome commented 7 years ago

I found that the BLASTN switch appends some information to the end of the read name, e.g. Basecall_2D_7/0_730, while POA does not. Why is that? Moreover, what exactly does the number "7" stand for? Initially I thought it might be the number of molecules used for sequence correction; however, the FASTA file from the GRAPHMAP switch has reads with the numbers 2, 3, 4 or 5, which shouldn't happen, as the minimum should be 6, if I'm right.

lch14forever commented 7 years ago

Hi Szymonome,

I have opened another issue for your question: https://github.com/CSB5/INC-Seq/issues/6