CSB5 / INC-Seq

INC-Seq: Accurate single molecule reads using nanopore sequencing

Error #3

Closed Szymonome closed 8 years ago

Szymonome commented 8 years ago

Hello,

Last week I was testing the INC-Seq software with your "ladder_rep_1" data, and after processing read 73 I got an error (please see below). It looks like the software worked fine for the first 72 reads. Do you know why this happens?

Regards Szymon

---------- Processing read 72 ----------
Max number of segments found: 0
Consensus construction failed!
---------- Processing read 73 ----------
Max number of segments found: 7
Number of segments of the candidate strech: 7
Candidate read found!
Traceback (most recent call last):
  File "/home/opt/INC-Seq/inc-seq.py", line 160, in <module>
    sys.exit(main(sys.argv[1:]))
  File "/home/opt/INC-Seq/inc-seq.py", line 147, in main
    args.seg_cov, args.iterative)
  File "/home/opt/INC-Seq/inc-seq.py", line 24, in callBuildConsensus
    seg_cov, iterative)
  File "/software/INC-Seq/utils/buildConsensus.py", line 288, in consensus_blastn
    consensus = pbdagcon(tmpname+'.m5', 0)
  File "/software/INC-Seq/utils/buildConsensus.py", line 206, in pbdagcon
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
  File "/home/opt/.pyenv/versions/2.7.8/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/home/opt/.pyenv/versions/2.7.8/lib/python2.7/subprocess.py", line 1327, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

lch14forever commented 8 years ago

Hi Szymon,

I was not able to reproduce your error. Could you kindly send me the 73rd read causing the error from your fasta file together with the command you used to run the pipeline?

Thanks. Chenhao.

Szymonome commented 8 years ago

Hi Chenhao,

We run INC-Seq on CentOS 6.3 with Python 2.7.8. Would that make any difference?

The remaining programs were installed according to your requirements: Biopython 1.65 and BLAST 2.2.28+.

The command I used: /home/opt/INC-Seq/inc-seq.py -i pass.fasta -o pass.out

Read 73:

2d46a7a6-0bb5-4994-b1bb-bc8494695cbb_Basecall_2D_2d GISNB474_10bacRCAsheared091215_4947_1_ch100_file68_strand pass/GISNB474_10bacRCAsheared091215_4947_1_ch100_file68_strand.fast5

CCGTGGTTATACTTAGCCCGGAAGACAACCTTACCAAATCTTGACATCCTTTGACACTCTAGGATAGAGCCTTCCCCTTCGGGGACAAAGTGACAGGTGTGGCATGGTTGTCAGCTCGTGTCACGCTTTCTAAAGGGAGGCAGCAGTAGGGAATCTTCCGCAATGGGCGAAAGCCTGACGGAGCAACGCCGCGTGAGTGATGAAAGGTCTTCTGGATCGTAAAACTCTGTTATTAGGGAAGAACATATGTGTAAGTAACTGTGCATCTTGACGGTACCTAAGGCCGAAAGCCACGGCTAACACGTGCCAGCAGCCCGGTAATACGTAGGTGGCAAGCGTTATCCGGAATTATTGGGCGTAAAGCGCGCGTAGGCGGTTTAAGTCTGATGTGAAGCCCACGGCTCAACCGTGGAGGGTCATTGGAAACTGGAAAACTTGAGTGCAGAACAGGAAAGTGGAATTCCATGTGTAGCGGTGAAAAATGCGCGGATATGGAGGAACACCAGTGTGAAGGCGACTTTCTGGTCTGTAACTGACGCTGATGTGCGAAAGCGTGGGGATCAAACAGGATTAGATACCTTGGTAGTCCACGCCGTAAACGATGAGTGCTAAGTGTTAGGGGGTTTCCGCCCTTAGTGCTGCAATGGACCCGCATTAAGCACTCCGCCTGGGGAGTACGACCGCAGGTTGAAACTCAAAGGAATTGACGGGGACCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGGCAACCGGACGCGAAGAACCTTACCAAATCTTGACATCCTTTGACAACAACTCTAGAGATAGAGCCTTCCCCTTCGGGGACAAATGACAGGTGGTGCATGGTTGTCGTCAGCTCGTGTCGCTGTCCTCCACGGGAGGCAGCGATCAGGGAATTCCGCGAAACTGGCAAGCTGAGGCGCCAGTAGTATGAAGGTTCGGATCGTAAACTCTGTTATTAGGAAGAACATATGTGATATGTGCACATCTTGACGGTACCTAATGAAGACGCTAACTACGTGCCAGCAGCCCGCGGTAGGTACCACGTAGGTGGCAAGCGTTATCCGGAATTATTGGGCGTAAAGGGCGCGTAGGCGGTTTTAAGTCTGATGTGAAAGCCCACGGCTCAACCCGTGGAGGTCATTGGAATCTGGAAACTTGAGTGGAGCAGAAGAGGAAAGTGGAATTCCATGTGTAGCGGTGAAATGCGCAGAGATATGAGGAACACCAGTGGCGAAGGCGATTCTGGTCTGTAACTGACGCTGATGTGCGAAAGTGTGGGGGATCAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGATGAGTGCTAAGTGTTAGGGGATTCCGCCCCTTAGCTGCTGCGCAGCTAACGCATTTGAGGCCGCTCCGCCTGGGGAGTACGACCGCAGGTTGAACTCAAAGGAATTGACGGGGACCCGCACAAGCGGTGGAGCATGTGGTTTAATGCGAAGCAACGGGGTGAAAGAACCTTACCACAAATCTTGACATCCTTTGACAACTCCGAGACCAGCCTTCCCCTTCGGGGGACAAAGTGACAGGTGGTGCATGGTTGTCGTCAGCTCGTGTCGCTTCTACGGGAGGCAGCAGTAGGAATCTTCCGCAATGGGCGAAAGCCTGACGGAGCAACGCCGCGTGAGTGATGAAGGTCTTCGGATCGTAAAAACTCTGTTATTAGGGGAAGAACATATGTGTAATAACTGTGCACATCTTGACGGTATAAGATTACAGAAAGCCACGGCTAACTCGTGCCAGCACCCCCGGGGCGGTAATACGTAGGTGCAAGCGTTATCCGGAATTATTGGGCGTAAGGGCGCGTAGGCGGTTTTAAGTCTGATGTGAAAGCCCACGGCTCAACCGTGGAGGGTCATTGGAAACTGGAAAACTTAATATTAAGAAGAGGAAAGTGGAAATTCCATGTGTAGCACGTGAAATCCCAGAGATATGGAGGAACACCAGTGGCGAAGGCGACTTT
CTGGTCTTATGACTGACGCTGATGTGCGAGAAAGCGTGGGGGATCAAACAGGATTAGATACCCCTGGTAGTCCCACGCCGTAAACGATGAGTGCTAAGTGTTAGGGGTTTTCCGCACGCTGATCAACTGCATAGGCATTCCACACTCCGCCTGGGGAGTACGACCGCAAGGTTGAAAAACCTGCAAAGGAATGACGGGGACCCGCACAAGCATCTTGGAGCATGTGGTTTAATTCGAAGCAACGCGAAGAACCTTACCAAATCTTATGACCGCCTTTGACACTCTAGAGATAGAGCCTTCCCCTTCGGGGGACAAAGTACCTAGGTTGCATGGTTGTCGTCAGCTCGTGTCGGTCCTCCCGGGAGGCAGCAGTAGGGAATCTTCCGCAATGGGCGAAGCCTGACGGAGCAACGCCGCGTGAGTGATATGAAGGTCTTCGATCGTAAAACTCTGTTATTAGGGAAAGAACATATGTGTAAGTAACTGTGCACATCTTGACGGTACGGATCAGAAGCCACGGCTAACTACGTGCCAGCAGCCGCGGTAATACGTAGGTGGCGCAAGCGTTAGCGGAATTATTGCGTAAGGGCGCGGTAGGGCGGTTTAAGTCTGATGTGAAAGCCCACGGCTCACCGTGTTGGGGGGTCATTTGGAAATGGGAAAACTTGAGTGCAGAAGAAAGTGGAATTCCATGTGTAGCGTGAAATGCGCAGAGATATGGAGGAACACCAGTGGCAAAGCGACTTTCTGGTCTGTAACTGACGCTGATGTGCGAAAGCGTGGGGATCAAACACCCAAGTCGATACCCCTGGTAGTCCACCGCCGTAAACGATGAGTGCTAAGTGTTAGGGGGTTTCCGCCCTTAGTGCGCTGCAGCACTGGAAGTTAAGCACTCCGCCTGGGGAGTACGACCGCAAGGTTGAAACCTGCCAAAGGAATTGACTGGATAGGGACAAGCGGTGGAGCATTGGTTTAATTCGAAGCAACGCGAAGAACCTTACCAAATCTTGACATCCTTTGACAACTCTAGAGATAGAGCCTTCCCTGTCGGGGAGACAAAGTGACAGGTGGTGCATCGGTTGTCGTCAGCTCGTGTCGCTTCTACGGGAGGCAGCAGTAGGGAATCTTCCGCAATGGGTGAAAGCCTGACGGAGCAACGCCGCGTGAGTGATGAAGGTCTTCGATCGTAAACTCTGTTATTAGGGAAGAACATATGTGTAAGTAACTGTGCACATCTTGACGGTACCTAAGATCAGAAAGCCACGGCTAACTACGTGCCAGCAGCCGCGCGGTAATACGTAGGTGGCAGAGCGTTGATACATAGGGAATTATTGGGCGTAAGCGCGCGTAGGCGGTTTTAAGTCTGATGTGAAAGCCCACGGCTCAACCGTGGAGGGTCATTGGAAACTGGAAAACTTGAGTGCAGAAAGAGGAAAGTGGAATTCCATGTGTGTACCTCGGTGAAATGCGCAGAGATATGGAGGAACACCAGTGGCGAAGGCGACTTTTCTGGTCTGTAAAACCTGACGCTGATGTGCGAAAACGTGGGGATCAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGATGAGTGCTAAGTGTTAGGGGGTTTCCGCTTAGTGCTGCGCAGCTAACGCATTAAGCACTCCGCCTGGGGGGAGTCACGACCGCAAGGTTGAAACTCAAAGGAATTGACGGGGACCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGAGCAACGCGAAGAACCTTACCAAATCTTGACATCCTGTCTTTGACAACTCTAGAGATAGAGCCTTCCCCTTCCTAGGCAAACAAAGTGACAGGTGCATGGTTGTCGTCAGCTCGGATTGCTTCTACGGGAGGCAGCAAGTGAGGAATCTTCCGCAATGGGCGAAAGCCTGACGGAGCAACGCCGCGTGAGTGATGAAGGTCTTCGCTGTAAAACTCTGTTATTAGGGAAGAACATATGTGTAAGTAACTGTGCACATCTTG
ACGGTACCTAAGATCTACAGAAAGCCACGGCTAACTACGTGCCAGCAGCCGGTAATACGTAGGTGGCAAGCGTTATCCGGGGGAATTATTGGGCGTAAAGCGCGCGGTAGGCTTTTTAAGTCTGATGTGAAAGCCCACGCTCAACCGTGGAGGTGCTATTGGAAACTGGAAAACTTGAGTGCCAAGAAGAGGAAAGTGGAATTCCATGTGTAGCGGGTGCGAAATGCGCAGAGATATGGAGTAACGAGTGGCGAAGGCGACTTTCTGGTCTTAACTGACGCTGATGTGCGAAAGCCGTGGGGGATCAAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGATGAGTGCTAAGTGTTAGGGGGTTCCGCCTTAGTGCGCAGCTAACGCATTAAGCACTCCGCCTGGGGGAGTACGACCGCAGGTTGAAACTCAAAGGAATTGACGGCACTGGAACAAGCGGTGGAGCATGTGGTTTAATTCGAAGCGGACCGAAGAACCTTACCAAATCGTATTGACATCCTTTGACAACTCTAGAGATAGAGCCTTTCCCCTTCGGGGGACAAAGTGACAGGTGGTGCATGGTTGTCGTCAGCTCGTGTCGCTTCTACGGGAGGCAGCAGTAGGGGAATCTTCCGCAATGGGCGAAAGCCTGACGGAGCAACGCCGCGTGAGTATCATCGAAGGTCTTCGGATCGTAAAACTCTGTTATTAGGGAAGAACATATGTGTAATGACTGTGCACATCTTGGACGTGGACAGTGATAAAGCCGGCTATAACTACGTGCCAGCAGCCGCGGTAATACGTAGGTGCAGGCGTTATCCGGAAATTATTGGGCGTAAAGGGCGCGTAGGCATTTTTTAAGTCTGATGTGAAAGCCCACGGCTCAACCGTGGAGGTGCCATTGGCGAACTGGAAAACTTGAGTGCAGAAGAGGAAAGTGGAATTCCATGTGTAGCGGTGAAATGCGCAGAGATATGGAGGAACACCAGTGGCGAAGGCGACTTTCTGGTCTGTAACTGACGCTGATGTGCGAAGCGTGGGGAAGTCCAAACAGGATTAGATACCCTAGTAGTCCCACGGGCCTGACGATGAGTGCTAAGTGTGTTAGGGGGTTTCCGCCCCTGCCCTGTGCAGCTACGCATTATTAAGCACTCTCCGCCTGGGGGAGTACGACCGCAAGGTTGAAACTCAAAGGAATGACGGGGACCCGCACAAGCGGTGGAGCCGTGGTTTAATTCGAAGCAACGCGAAGATCTATTACCAAATCTTGACATCCTTTGACAACTCTAGAGATAGAGCCTTCTCGTCGGGGACAAAGTGACAGGTGGTGCATGGTTGGTCAGTGCTTCTACGGGAGGCAGCAGTAGGGAATCTTCCGCAAACTTGGGCGAAAGCCTGACGGAGCAAACGCCGCGTGAGTGATGAAGGTCTTCGTGGATCGTAAACTCTGTTATTAGGGAAGAACATATGTGTAAGTAACTGTGCACATCTTGACGGTACGGATCAGAAAGCCCACGGCTAACTTACGTGCCAGCAGCCAGCGCGGTAATACGTAGGTGGCAAGCGTTATCCGGAATTATTGGGCGTAAAGCGCGCGTAGGCGGTTTTAAGTCTGATGTGAAGCCCCACGGCTCAACCGTGGAGGGTCGTGGAAACTGGAAGCTCTTGGCTAACTTGAAAGAGGAAAGTGGAATTCCATGTGTAGCGTGAAATGCGCAGAGATATGGCCCAGGAACACCAGTGGCGAGGCAACTTTCTGGTCTGTAACTGACGCTGATGTGCGAAAGCGTGTGGGGATCGAACAGGATTGATACCCTGGTAGTCCACGGATCAACGATGAGTGCTAAGTGTTAGGGGTTTCGGCCCCTTAGTGCTGTGCAGCTAACGCATTAAGCGGCACTCCGCCTGGGGAGTACGACCGCAGATCCGAGTAAAGGAAGTCGCACGAAACTAGACAAGCGGTGGAGCATGTGGTTTAATTCGAAGCAATGCGAA
GAACCTTACCAAATTATTACACCACATGATACGTTTATTTCC

When I used: /home/opt/INC-Seq/inc-seq.py -i data.fa -o pass.out -a graphmap -m 500

The program stopped even sooner:

---------- Processing read 31 ----------
Max number of segments found: 6
Number of segments of the candidate strech: 6
Candidate read found!
Traceback (most recent call last):
  File "/home/opt/INC-Seq/inc-seq.py", line 160, in <module>
    sys.exit(main(sys.argv[1:]))
  File "/home/opt/INC-Seq/inc-seq.py", line 147, in main
    args.seg_cov, args.iterative)
  File "/home/opt/INC-Seq/inc-seq.py", line 28, in callBuildConsensus
    seg_cov, iterative)
  File "/software/INC-Seq/utils/buildConsensus.py", line 374, in consensus_graphmap
    consensus = pbdagcon(tmpname + '.m5', 0)
  File "/software/INC-Seq/utils/buildConsensus.py", line 206, in pbdagcon
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
  File "/home/opt/.pyenv/versions/2.7.8/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/home/opt/.pyenv/versions/2.7.8/lib/python2.7/subprocess.py", line 1327, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

The only time INC-Seq worked fine was with the POA option: /home/opt/INC-Seq/inc-seq.py -i data.fa -o pass.out -a poa -m 500

However, the GraphMap and BLAST modes crash every time.

Kind regards Szymon

lch14forever commented 8 years ago

Hi Szymon,

Do you have PBDAGCON in your path?

Chenhao

Szymonome commented 8 years ago

Chenhao,

Yes, we have:

export PATH=/home/opt/pacb/bin:$PATH
ls /home/opt/pacb/bin
bam2bax ccmake gfortran h5dump h5repart pbindexdump toe bam2plx clear gif2h5 h5import h5stat pbmerge tput bam2sam cmake h52gif h5jam h5unjam pls2fasta tset bax2bam cpack h5c++ h5ls infocmp reset blasr cpp h5cc h5mkgrp infotocap samtools captoinfo ctest h5copy h5perf_serial loadPulses sawriter CC g++ h5debug h5redeploy ncurses6-config tabs ccache gcc h5diff h5repack pbindex tic

Do we need anything else?

Szymon

lch14forever commented 8 years ago

What happens if you run "pbdagcon" directly in your terminal? If you see a "command not found" error, you can try cloning https://github.com/PacificBiosciences/pbdagcon, compiling it, and adding it to your path.
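The same check can be scripted. The sketch below is an illustration and not part of INC-Seq: it assumes Python 3 (for `shutil.which`), and the helper name `find_missing_tools` and the executable names passed to it are assumptions. An `OSError: [Errno 2]` raised from `subprocess.Popen`, as in the tracebacks above, is what Python reports when the program it was asked to start cannot be found.

```python
import shutil


def find_missing_tools(tools):
    """Return the tool names that cannot be found on the current PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    # "pbdagcon" and "blastn" are the executable names assumed here.
    missing = find_missing_tools(["pbdagcon", "blastn"])
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All tools found.")
```

Running this in the same shell session that launches inc-seq.py matters, because PATH changes made with export only apply to that session and its child processes.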

Chenhao.

lch14forever commented 8 years ago

Hi Szymon,

In the meantime, I have also pushed a new commit that includes the pbdagcon binary used for our manuscript. Can you test it out?

Chenhao.

Szymonome commented 8 years ago

Chenhao,

It works now. I set up the environment with:

export PYENV_ROOT="/home/opt/.pyenv"
export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init -)"
export PYTHONPATH=/home/opt/INC-Seq/utils:$PYTHONPATH
export PATH=/home/opt/pacb/bin:$PATH
export PATH=/home/opt/pbdagcon/src/cpp:$PATH
/home/opt/INC-Seq/inc-seq.py --help

It has analysed over 250 reads, but my output file is still empty. Is that expected?

It looks like some of the data is being saved in a logs folder.

Szymon

lch14forever commented 8 years ago

I am not sure about the "logs folder" you are referring to. Are you using bpipe to run INC-Seq?

Szymonome commented 8 years ago

I used the basic command: /home/opt/INC-Seq/inc-seq.py -i pass.fasta -o pass.out

The program generated a folder called logs with one file inside, myeasylog.log:

2016-06-07 00:47:04,498 INFO [default] Multi-threaded. Input: /dev/shm/incseq_pass.fasta_2016-06-07_00-46-47.733103/3c0475d65c347bb420808531b425e5ce.tmp.m5, Threads: 4
2016-06-07 00:47:04,500 DEBUG [Reader] [szymonome@becker.eng.gla.ac.uk] [FUNCTION] [FILE:0] Consensus candidate: 2d46a7a6-0bb5-4994-b1bb-bc8494695cbb_Basecall_2D_2d_7
2016-06-07 00:47:04,501 INFO [Consensus] Consensus calling: 2d46a7a6-0bb5-4994-b1bb-bc8494695cbb_Basecall_2D_2d_7 Alignments: 6
2016-06-07 00:47:10,218 INFO [default] Multi-threaded. Input: /dev/shm/incseq_pass.fasta_2016-06-07_00-46-47.733103/589b2c2a28cbbdb58343a9ebed958a67.tmp.m5, Threads: 4
2016-06-07 00:47:10,219 DEBUG [Reader] [szymonome@becker.eng.gla.ac.uk] [FUNCTION] [FILE:0] Consensus candidate: 0efc0fab-903e-4489-923f-b36e553bdd38_Basecall_2D_2d_6
2016-06-07 00:47:10,219 INFO [Consensus] Consensus calling: 0efc0fab-903e-4489-923f-b36e553bdd38_Basecall_2D_2d_6 Alignments: 5

However, pass.out is still empty.

lch14forever commented 8 years ago

Hi Szymon,

This does not look like INC-Seq's logging style. It also looks odd to me that multi-threading was used. A typical INC-Seq run should produce something like the following (with the read you sent me). Do you get any output if you run the test case in data/inc_seq_test_read.fa?

00:07:21|lich@n067|tmp$ ~/projects_backup/INCSeq/inc-seq.py -i tmp.fa
---------- Processing read 1 ----------
Max number of segments found: 7
Number of segments of the candidate strech: 7
Candidate read found!
Consensus called 2d46a7a6-0bb5-4994-b1bb-bc8494695cbb_Basecall_2D_2d Number of segments 7

2d46a7a6-0bb5-4994-b1bb-bc8494695cbb_Basecall_2D_2d_7/0_763 GTGGTTTAATTCGAAGCAACGCGAAGAACCTTACCAAATCTTGACATCCTTTGACAACTCTAGAGATAGAGCCTTCCCCTTCGGGGGACAAAGTGACAGGTGGTGCATGGTTGTCGTCAGCTCGTGTCGCTTCTACGGGAGGCAGCAGTAGGGAATCTTCCGCAATGGGCGAAAGCCTGACGGAGCAACGCCGCGTGAGTGATGAAGGTCTTCGGATCGTAAAACTCTGTTATTAGGGAAGAACATATGTGTAAGTAACTGTGCACATCTTGACGGTACCTAAGATGAAGCCACGGCTAACTACGTGCCAGCAGCCGCGGTAATACGTAGGTGGCAAGCGTTATCCGGAATTATTGGGCGTAAAGGGCGCGTAGGCGGTTTTAAGTCTGATGTGAAAGCCCACGGCTCAACCGTGGAGGGTCATTGGAAACTGGAAAACTTGAGTGCAGAAGAGGAAAGTGGAATTCCATGTGTAGCGGTGAAATGCGCAGAGATATGGAGGAACACCAGTGGCGAAGGCGACTTTCTGGTCTGTAACTGACGCTGATGTGCGAAAGCGTGGGGATCAAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGATGAGTGCTAAGTGTTAGGGGGTTTCCGCCCTTAGCGCAGCTAACGCATTAAAGCACTCCGCCTGGGGAGTACGACCGCAAGGTTGAAACTCAAAGGAATTGACGGGGACCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGAAGCAACGCGAAGA

Szymonome commented 8 years ago

The test file generated correct output:

/inc-seq.py -i inc_seq_test_read.fa -o out.fa
---------- Processing read 1 ----------
Max number of segments found: 20
Number of segments of the candidate strech: 20
Candidate read found!
Warning: PBDAGCON timeout! Trimming 1 base(s).
Consensus called ddfdd3f2-c50b-4843-b1f2-c1669785858a_Basecall_2D_2d Number of segments 20

So for the ladder_rep_1 data, do I have to wait until all reads have been processed before the output file is updated? It looks like the output file is not updated as soon as a consensus read is generated. Below, a consensus was called on read 25, but pass.out is still empty.

---------- Processing read 25 ----------
Max number of segments found: 7
Number of segments of the candidate strech: 7
Candidate read found!
Consensus called 2d46a7a6-0bb5-4994-b1bb-bc8494695cbb_Basecall_2D_2d Number of segments 7
---------- Processing read 26 ----------
Max number of segments found: 2
Not enough alignmets!
Consensus construction failed!
---------- Processing read 27 ----------
Max number of segments found: 6
Number of segments of the candidate strech: 6
Candidate read found!

Regards Szymon

lch14forever commented 8 years ago

For cases like the read 25 you showed, the consensus should have been called. Most likely your system's output buffer has not been flushed yet, so nothing has been written to your output file. You can test this with a few reads first (ideally a few long reads like read 25) and see whether you get the output FASTA file.
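To illustrate what buffering means here, the following is a minimal sketch and not INC-Seq's actual writing code. The helper name `append_record` is hypothetical. Without the explicit flush, Python and the operating system may hold small writes in memory, so a file can look empty while the program is still running even though records have already been "written".

```python
import os


def append_record(handle, name, sequence):
    """Write one FASTA record and force it out to disk immediately."""
    handle.write(">{}\n{}\n".format(name, sequence))
    handle.flush()             # push Python's internal buffer to the OS
    os.fsync(handle.fileno())  # ask the OS to commit the data to disk
```

Flushing after every record trades some speed for being able to inspect the output file while a long run is in progress.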

Szymonome commented 8 years ago

I subsampled 100 reads and the program seems to work fine now. INC-Seq generated 3 concatemerised molecules. Thank you very much for your help!

I looked through the INC-Seq code and found a section where you use the primer sequence to split concatemers. However, this part of the code is not available for use at the moment. Are you still planning to support that mode in the future? Did you get any good results with that approach?
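The thread does not say how the reads were subsampled. One simple way to make a small test input, sketched here under that caveat, is to keep only the first n records of the FASTA file. The function name `first_n_fasta_records` and the output file name in the usage note are hypothetical.

```python
def first_n_fasta_records(lines, n):
    """Yield the lines belonging to the first n FASTA records."""
    seen = 0
    for line in lines:
        if line.startswith(">"):
            seen += 1
            if seen > n:
                break  # the (n+1)-th header marks the end of the subset
        if seen > 0:
            yield line  # header or sequence line of a kept record
```

For example, `with open("pass.fasta") as src, open("subset.fasta", "w") as dst: dst.writelines(first_n_fasta_records(src, 100))` would write the first 100 records to a new file. Taking the first records is not a random sample, which is fine for checking that the pipeline produces output at all.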

lch14forever commented 8 years ago

That is great!

Initially we thought we could detect the primer sequence in the corrected reads to resolve the correct orientation. However, I found that the primer sequence could only be detected in a very limited number of consensus sequences, possibly because part of the primer sequence is lost during library preparation. So I do not plan to restore that function. If you are interested, you can check out a version before commit "fc20bdc8cac50281c26f833f945d79ef81b0aa77", which should include the implementation for recovering the orientation based on primer sequences.

For 16S classification, the consensus reads can simply be mapped with BLASTN to a reference database (e.g. SILVA) to restore the correct orientation. Another trick, which we used in the manuscript, is to concatenate the consensus twice, which should theoretically restore the correct orientation.
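One way to read the doubling trick, which is an interpretation and not something spelled out above, is that a consensus built from a concatemer may start at an arbitrary point within the repeated unit. Joining the consensus to itself then guarantees that the full unit appears somewhere as one contiguous stretch. The sketch below demonstrates that property on a toy sequence; all function names are hypothetical and this is not the code used in the manuscript.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def double_consensus(consensus):
    """Concatenate a consensus sequence with itself."""
    return consensus + consensus


def contains_full_unit(consensus, unit):
    """Check whether the doubled consensus contains the unit on either strand."""
    doubled = double_consensus(consensus)
    return unit in doubled or reverse_complement(unit) in doubled
```

With real reads, the exact substring test would be replaced by an alignment (for example BLASTN against a reference), since consensus sequences still carry some errors.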