Closed defendant602 closed 4 years ago
Could you share the ./clustering/fastq_files/1823.fastq file with me
(e.g. by sending it to me by email)? If so, I can take a look at it right away.
Thanks for the quick reply! It's absolutely OK to share the fastq file with you. May I have your email address?
ksahlin [at] kth [dot] se. The address should show if you click on my profile.
If the file is too large, we can find another way to share it. Just let me know.
The 1823.fastq file is actually very small, so I don't think it is necessary to send it by email. I have uploaded it as an attachment to this comment.
Thanks!
I may have fixed the bug in the new version, v0.0.5, which is available on pip and here on GitHub in the latest master commit 563d0a2.
The reason I say "maybe" is that I didn't observe the same runtime error that you reported for this instance. However, I did get a runtime error in the same region of the code, which is now fixed. So this is somewhat strange.
Please give the new version a try and see if it also works for you. In addition, the fix could slightly improve accuracy, so you might want to rerun it on all your data (although the improvement should be very minor).
Thanks for reporting and let me know if it solves the issue!
For logging purposes, my error was:
python -m pyinstrument isONcorrect --fastq /Users/kxs624/tmp/ISONCORRECT/user_bug1/fastq/1823.fastq --outfolder /Users/kxs624/tmp/ISONCORRECT/user_bug1/out/ --verbose
Traceback (most recent call last):
  File "/Users/kxs624/anaconda3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Users/kxs624/anaconda3/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/kxs624/anaconda3/lib/python3.6/site-packages/pyinstrument/__main__.py", line 156, in <module>
    main()
  File "/Users/kxs624/anaconda3/lib/python3.6/site-packages/pyinstrument/__main__.py", line 87, in main
    exec_(code, globs, None)
  File "isONcorrect", line 1098, in <module>
    main(args)
  File "isONcorrect", line 991, in main
    corrected_seq, other_reads_corrected_regions = correct_read(seq, reads, intervals_to_correct, k_size, work_dir, v_depth_ratio_threshold, max_seqs_to_spoa, args.disable_numpy, args.verbose)
  File "isONcorrect", line 774, in correct_read
    best_corr, other_corrections = get_best_corrections(instance, reads, k_size, work_dir, v_depth_ratio_threshold, max_seqs_to_spoa, disable_numpy) # store all corrected regions within all reads in large container and keep track when correcting new read to not re-compute these regions
  File "isONcorrect", line 567, in get_best_corrections
    return curr_read_corr[k_size:-k_size], other_corrections_final
UnboundLocalError: local variable 'curr_read_corr' referenced before assignment
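For reference, an UnboundLocalError like this typically appears when a variable is assigned only inside a branch or loop body that never runs for a particular input. The sketch below is a generic illustration and not the isONcorrect code; the function names `pick_correction` and `pick_correction_safe` are hypothetical.

```python
def pick_correction(candidates):
    # Hypothetical helper: `best` is assigned only inside the loop body,
    # so an empty `candidates` list leaves it unbound at the return.
    for cand in candidates:
        best = cand
    return best


def pick_correction_safe(candidates, default=None):
    # Same logic, but `best` always has a value before the loop starts.
    best = default
    for cand in candidates:
        best = cand
    return best


try:
    pick_correction([])
except UnboundLocalError as err:
    # Same exception class as in the traceback above.
    error_name = type(err).__name__
```

Initializing the variable before the conditional code path (or returning early) avoids the error.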
Yes, it solved my problem and it runs well! Thanks again!
Hi, thanks for developing this great software for read correction. I have run the isONcorrect pipeline on my own ONT cDNA data, but I got the following error:
subprocess.CalledProcessError: Command '['/usr/bin/time', '/export/pipeline/RNASeq/Software/isONcorrect/isONcorrect-master/isONcorrect', '--fastq', './clustering/fastq_files/1823.fastq', '--outfolder', './correction/1823', '--exact_instance_limit', '50', '--set_w_dynamically', '--k', '9', '--w', '10', '--xmin', '14', '--xmax', '80', '--T', '0.1']' returned non-zero exit status 1.
It ran well on other clusters, but it went wrong on 1823.fastq. When I run this command alone, the error info is:
Too abundant: TATATATAT ACACATATA 13 12
Too abundant: ATATATACA TATATATAT 13 12
Too abundant: ATATACACA TATATATAT 13 12
Too abundant: CACTCCAGC AAAAAAAAA 13 12
Too abundant: ACTCCAGCC AAAAAAAAA 13 12
Average abundance for non-unique minimizer-combs: 3.2074067588863597
Number of singleton minimizer combinations filtered out: 90418
Traceback (most recent call last):
  File "/export/pipeline/RNASeq/Software/isONcorrect/isONcorrect-master/isONcorrect", line 1098, in <module>
    main(args)
  File "/export/pipeline/RNASeq/Software/isONcorrect/isONcorrect-master/isONcorrect", line 991, in main
    corrected_seq, other_reads_corrected_regions = correct_read(seq, reads, intervals_to_correct, k_size, work_dir, v_depth_ratio_threshold, max_seqs_to_spoa, args.disable_numpy, args.verbose)
  File "/export/pipeline/RNASeq/Software/isONcorrect/isONcorrect-master/isONcorrect", line 774, in correct_read
    best_corr, other_corrections = get_best_corrections(instance, reads, k_size, work_dir, v_depth_ratio_threshold, max_seqs_to_spoa, disable_numpy) # store all corrected regions within all reads in large container and keep track when correcting new read to not re-compute these regions
  File "/export/pipeline/RNASeq/Software/isONcorrect/isONcorrect-master/isONcorrect", line 467, in get_best_corrections
    read_alignment, ref_alignment = help_functions.cigar_to_seq(cigar_string, seq, spoa_ref)
  File "/export/pipeline/RNASeq/Software/isONcorrect/isONcorrect-master/modules/help_functions.py", line 43, in cigar_to_seq
    result = re.split(r'[=DXSMI]+', cigar)
  File "/export/software/Base/python/Python/Python-3.6.3/lib/python3.6/re.py", line 212, in split
    return _compile(pattern, flags).split(string, maxsplit)
TypeError: expected string or bytes-like object
1.99user 4.93system 0:01.79elapsed 386%CPU (0avgtext+0avgdata 87308maxresident)k
0inputs+392outputs (44major+62262minor)pagefaults 0swaps
In the line <read_alignment, ref_alignment = help_functions.cigar_to_seq(cigar_string, seq, spoa_ref)>, cigar_string is actually None rather than a string.
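To illustrate, the same TypeError can be reproduced outside isONcorrect by passing None to re.split with the pattern shown in the traceback. This is only a minimal sketch: the guard function `split_cigar_lengths` is a hypothetical name, not part of isONcorrect, and it does not explain why no CIGAR string was produced in the first place.

```python
import re

CIGAR_OPS = r'[=DXSMI]+'  # pattern copied from the traceback above


def split_cigar_lengths(cigar):
    # Hypothetical guard: fail with a clear message instead of letting
    # re.split raise a TypeError on a missing CIGAR string.
    if cigar is None:
        raise ValueError("no CIGAR string was produced for this alignment")
    return re.split(CIGAR_OPS, cigar)


try:
    re.split(CIGAR_OPS, None)
except TypeError as err:
    # Begins with "expected string or bytes-like object"
    message = str(err)
```

With a guard of this kind, the failure would point at the missing alignment rather than at the regular expression call.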
Could you please take a moment to check this error out? Thanks very much!