stage_two output sample lost

brilliant2643 commented 1 year ago

Hi! I'm having a problem when I run the stage_two step. There are 27 input files, but I only got 25 samples in the output. The command I used is as follows:

args_oap stage_two -i output-arg -t 100 --e 0.00001

And the files are as follows:

.
├── hx-g1.fasta
├── hx-g2.fasta
├── hx-g3.fasta
├── hx-s1.fasta
├── hx-s2.fasta
├── hx-s3.fasta
├── hx-w1.fasta
├── hx-w2.fasta
├── hx-w3.fasta
├── lx-g1.fasta
├── lx-g2.fasta
├── lx-g3.fasta
├── lx-s1.fasta
├── lx-s2.fasta
├── lx-s3.fasta
├── lx-w1.fasta
├── lx-w2.fasta
├── lx-w3.fasta
├── y-g1.fasta
├── y-g2.fasta
├── y-g3.fasta
├── y-s1.fasta
├── y-s2.fasta
├── y-s3.fasta
├── y-w1.fasta
├── y-w2.fasta
└── y-w3.fasta

I have run into this problem several times with both the stage_one and the stage_two output files, and I worked around it by changing the file names, so I'm wondering whether there is a problem with the file names above. By the way, hx-g3 and y-g3 are the samples missing from the output abundance files. Are there any suggestions on how to name the files?

Thanks a lot, Brill
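For anyone hitting the same mismatch, a quick way to see which samples were dropped is to compare the input file names against the sample columns of an abundance table. The sketch below is only an illustration: it assumes the table is tab-separated with a row label in the first column and one column per sample named after the input file stem, which may not match the actual output layout of args_oap. The function name `missing_samples` is hypothetical.

```python
from pathlib import Path


def missing_samples(input_files, output_header, sep="\t"):
    """Return input sample names that do not appear in the output header.

    Assumption: the first header field is a row label and every other
    field is a sample name equal to the input file name without extension.
    """
    expected = [Path(name).stem for name in input_files]
    present = set(output_header.rstrip("\n").split(sep)[1:])
    return [name for name in expected if name not in present]


# Toy data for illustration only (not real tool output).
inputs = ["hx-g1.fasta", "hx-g3.fasta", "y-g3.fasta", "y-w3.fasta"]
header = "type\thx-g1\ty-w3\n"
print(missing_samples(inputs, header))  # ['hx-g3', 'y-g3']
```

In practice the header line would be read from one of the output abundance files and the input list from the input directory.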

brilliant2643 commented 1 year ago

Hi! It's me again. I found the cause of the problem: it's mainly because the software cannot annotate the file. I ran the command

args_oap stage_two -i output_test_hx/ -t 20 --e 0.00001

and got the following error:

[2023-08-15 13:15:57] INFO: Processing <output_test_hx/extracted.fa> ...
[2023-08-15 13:15:57] INFO: Extracting target sequences using BLAST ...
[2023-08-15 13:15:57] INFO: BLAST settings: 83102 bps, 29 reads, 20 threads, mt_mode 0.
[2023-08-15 13:15:58] INFO: Merging files ...
[2023-08-15 13:15:58] CRITICAL: No target sequence remained after merging structure files, no further normalization will be made.

But I noticed that stage_one did produce some results; none of the stage_one output files are empty. So I still have no idea why I get this error. Could you please help me figure it out?

Thanks a lot, Brill
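When several runs are logged together, it can help to pull out which processed file triggered the CRITICAL message. The following is a rough sketch that assumes the log lines look exactly like the ones quoted above (a "Processing <path>" line followed later by the CRITICAL line); the function name `files_without_targets` is made up for this example.

```python
import re

# Matches lines such as: "... INFO: Processing <output_test_hx/extracted.fa> ..."
PROCESSING = re.compile(r"INFO: Processing <(?P<path>[^>]+)>")
CRITICAL_TEXT = "CRITICAL: No target sequence remained"


def files_without_targets(log_lines):
    """Return the paths whose processing ended with the CRITICAL message."""
    failed = []
    current = None  # path from the most recent "Processing" line
    for line in log_lines:
        match = PROCESSING.search(line)
        if match:
            current = match.group("path")
        elif CRITICAL_TEXT in line and current is not None:
            failed.append(current)
    return failed


log = [
    "[2023-08-15 13:15:57] INFO: Processing <output_test_hx/extracted.fa> ...",
    "[2023-08-15 13:15:57] INFO: Extracting target sequences using BLAST ...",
    "[2023-08-15 13:15:58] INFO: Merging files ...",
    "[2023-08-15 13:15:58] CRITICAL: No target sequence remained after "
    "merging structure files, no further normalization will be made.",
]
print(files_without_targets(log))  # ['output_test_hx/extracted.fa']
```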

xinehc commented 1 year ago

Hi,

It seems that none of the 29 reads in this file passed the filter in stage_two. Note that stage_two uses a more stringent threshold than stage_one, so even if stage_one has results, those reads can still be discarded by stage_two.
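To illustrate the point about thresholds, here is a minimal sketch of how a read can survive a loose filter and then be dropped by a stricter one. The `Hit` class, the `passes` function, and all cutoff values are assumptions chosen for the example; they are not the actual defaults or internals of args_oap.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    read_id: str
    evalue: float
    identity: float  # percent identity of the alignment
    qcov: float      # percent of the read covered by the alignment


def passes(hit, max_evalue, min_identity, min_qcov):
    """True if the hit satisfies all three cutoffs."""
    return (
        hit.evalue <= max_evalue
        and hit.identity >= min_identity
        and hit.qcov >= min_qcov
    )


hits = [
    Hit("read1", 1e-12, 95.0, 90.0),
    Hit("read2", 1e-6, 60.0, 50.0),
]

# Illustrative cutoffs only: a permissive first pass and a stricter second pass.
loose = dict(max_evalue=1e-5, min_identity=45.0, min_qcov=0.0)
strict = dict(max_evalue=1e-7, min_identity=80.0, min_qcov=75.0)

loose_ids = [h.read_id for h in hits if passes(h, **loose)]
strict_ids = [h.read_id for h in hits if passes(h, **strict)]
print(loose_ids)   # ['read1', 'read2']
print(strict_ids)  # ['read1']
```

If every read in a sample behaves like `read2`, the first pass produces non-empty files while the second pass leaves nothing to normalize.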

brilliant2643 commented 1 year ago

Oh, I see. That's why it's not in the final files; it makes sense. Thank you for your reply!

And by the way, can I use binning results as the input files? I also get the error "CRITICAL: No target sequence remained after merging structure files, no further normalization will be made." when I run it on bin files. Sorry for the dumb question, but it has puzzled me for days.

Thanks again, Brill

xinehc commented 1 year ago

Hi,

args_oap does not support using contigs as inputs. If you want to annotate your binned contigs, you may consider aligning them directly to the SARG database (see: https://smile.hku.hk/ARGs/Indexing/download) using e.g. blastn/blastx/diamond/bowtie/minimap (some require predicting ORFs first).
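As one possible route, the DIAMOND option could look roughly like the sketch below, which only builds the command lines and does not run them. It assumes the downloaded SARG file is a protein FASTA and that `diamond` is installed; the file names, the `sarg_db` prefix, and the e-value are placeholders, not recommended settings.

```python
import shlex


def diamond_commands(sarg_fasta, contigs_fasta, out_tsv, threads=8, evalue=1e-5):
    """Build (makedb, blastx) command lists for aligning contigs to SARG.

    Assumption: sarg_fasta holds protein sequences, so blastx (translated
    search of nucleotide contigs) is used and no ORF prediction is needed.
    """
    db_prefix = "sarg_db"  # hypothetical database name
    makedb = ["diamond", "makedb", "--in", sarg_fasta, "--db", db_prefix]
    blastx = [
        "diamond", "blastx",
        "--db", db_prefix,
        "--query", contigs_fasta,
        "--out", out_tsv,
        "--outfmt", "6",  # BLAST-style tabular output
        "--threads", str(threads),
        "--evalue", str(evalue),
    ]
    return makedb, blastx


# Placeholder file names for illustration.
makedb, blastx = diamond_commands("SARG.fasta", "bin1.contigs.fa", "bin1.sarg.tsv")
print(shlex.join(makedb))
print(shlex.join(blastx))
```

The two lists can be passed to `subprocess.run(cmd, check=True)` once the paths point to real files. The resulting hits would still need to be filtered by identity and coverage before being interpreted.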

xinehc / args_oap

stage_two output sample lost #36