xinehc / args_oap

ARGs-OAP: Online Analysis Pipeline for Antibiotic Resistance Genes Detection from Metagenomic Data Using an Integrated Structured ARG Database
MIT License

Regarding output of stage two #35

Open chanchalrana opened 11 months ago

chanchalrana commented 11 months ago

I ran ARGs_OAP on 200 sequences (paired end). Stage one ran smoothly and produced two output files (metadata.txt and extracted.fa), but stage two failed with the error shown in the screenshot below: (screenshot: args_oap)

The command I used for stage one was:

args_oap stage_one -i input -o output -f fa -t 8

and for stage two:

args_oap stage_two -i output -t 8

Kindly help me resolve this issue. Thank you in advance.

xinehc commented 11 months ago

Hi,

It means your BLAST output file contains something that cannot be properly parsed:

df = pd.read_table(self._blastout, header=None, names=settings.cols)
if df.isnull().any(axis=None):
    logger.critical('BLAST output file <{}> cannot be parsed. Please check BLAST output file (--blastout).'.format(self.blastout))
    sys.exit(2)
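
To make the check concrete, here is a minimal sketch of how a single incomplete line can trip it. The column names and the two sample rows are made up for illustration (the real names live in settings.cols, which is not shown in this thread). The point is only that pandas fills missing trailing fields with NaN when names is given, so any short row makes the null test true.

```python
import io
import pandas as pd

# Hypothetical column names standing in for settings.cols
# (the 12 fields of standard BLAST tabular output).
cols = ['qseqid', 'sseqid', 'pident', 'length', 'mismatch', 'gapopen',
        'qstart', 'qend', 'sstart', 'send', 'evalue', 'bitscore']

# Two made-up rows: the second one is cut off mid-line,
# as it would be in an incomplete file.
text = (
    "read1\tgeneA\t98.5\t100\t1\t0\t1\t100\t5\t104\t1e-50\t180\n"
    "read2\tgeneB\t97.0\t100\n"
)

df = pd.read_table(io.StringIO(text), header=None, names=cols)

# Same test as in the snippet above: is any value missing anywhere?
has_missing = bool(df.isnull().any(axis=None))

# The rows that need a closer look.
bad_rows = df[df.isna().any(axis=1)]
```

Here has_missing is true and bad_rows holds only the read2 row, whose last eight fields are NaN.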

You may want to load the BLAST output file ('output/blastout.txt') manually and see what is happening:

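import pandas as pd

# Load the raw BLAST output without assigning column names
df = pd.read_table('output/blastout.txt', header=None)
# Show only the rows that contain at least one missing value
df[df.isna().any(axis=1)]

chanchalrana commented 11 months ago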

Earlier, I gave both commands simultaneously, and that produced the error above. When I ran the stage two command separately, it worked fine and I got the result.

Thank you so much for responding.

Also, is there any issue with giving around 4k sequences in one go? Before running 200 sequences, I tried all 4k sequences together. The first command (stage one) gave me the result. However, the stage two command failed after 12-13 days with the message: Killed.

So I started with a batch of 200 sequences instead of risking a second attempt with all 4k sequences, since that run took so long.
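
The batching workaround described above can be scripted. The sketch below is only an illustration under several assumptions: the helper names (make_batches, batch_commands, run_batches) and the batch_NNN folder layout are hypothetical, the only args_oap options used are the ones already quoted in this thread, and it is not confirmed here that args_oap follows symlinks (copy the files instead if it does not). "Killed" usually means the process received SIGKILL, often from the kernel's out-of-memory killer, but the thread does not confirm that this was the cause.

```python
import subprocess
from pathlib import Path


def make_batches(files, batch_size):
    """Split a sorted list of files into chunks of at most batch_size."""
    files = sorted(files)
    return [files[i:i + batch_size] for i in range(0, len(files), batch_size)]


def batch_commands(batch_dir, threads=8):
    """Build the two commands from this thread for one batch folder."""
    in_dir = str(Path(batch_dir) / 'input')
    out_dir = str(Path(batch_dir) / 'output')
    return [
        ['args_oap', 'stage_one', '-i', in_dir, '-o', out_dir,
         '-f', 'fa', '-t', str(threads)],
        ['args_oap', 'stage_two', '-i', out_dir, '-t', str(threads)],
    ]


def run_batches(input_dir, work_dir, batch_size=200, threads=8,
                run=subprocess.run):
    """Link each batch into its own folder, then run both stages in order."""
    files = [p for p in Path(input_dir).iterdir() if p.is_file()]
    for n, batch in enumerate(make_batches(files, batch_size), start=1):
        batch_dir = Path(work_dir) / 'batch_{:03d}'.format(n)
        (batch_dir / 'input').mkdir(parents=True, exist_ok=True)
        for f in batch:
            link = batch_dir / 'input' / f.name
            if not link.exists():
                link.symlink_to(f.resolve())
        for cmd in batch_commands(batch_dir, threads):
            # check=True stops at the first failure, so stage two never
            # starts on an incomplete stage one output.
            run(cmd, check=True)
```

A call such as run_batches('input', 'work', batch_size=200) would process the files 200 at a time. For paired-end data, both mates of a sample must land in the same batch; an even batch size on sorted file names is assumed to keep them together, which is worth verifying against the actual file names.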

Kindly suggest an alternative.