sanger-pathogens / ariba

Antimicrobial Resistance Identification By Assembly
http://sanger-pathogens.github.io/ariba/

problems with resfinder and vfdb #325

Open fedeserral opened 2 years ago

fedeserral commented 2 years ago

Hi! I have some problems when I try to download resfinder and vfdb_full. For resfinder, I get the following error when I run ariba prepareref: ariba.reference_data.Error: Sequence "erm(N).2_MZ015744" found in input fasta file but not in metadata file. Cannot continue
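To see which sequences trigger this error, a rough check like the one below might help. It is only a sketch: it assumes the sequence name is the FASTA header up to the first whitespace and that the first tab-separated column of the metadata file holds the same name. The function names are made up for illustration and are not part of ARIBA.

```python
def fasta_names(lines):
    """Yield the sequence name from each FASTA header line.

    Assumption: the name is the header text up to the first whitespace.
    """
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                yield fields[0]


def metadata_names(lines):
    """Collect the first tab-separated column of every non-empty line.

    Assumption: that column holds the sequence name.
    """
    names = set()
    for line in lines:
        row = line.rstrip("\n")
        if row:
            names.add(row.split("\t")[0])
    return names


def missing_from_metadata(fasta_lines, tsv_lines):
    """Return FASTA names that have no entry in the metadata, in file order."""
    known = metadata_names(tsv_lines)
    return [name for name in fasta_names(fasta_lines) if name not in known]
```

With the downloaded files opened as `fa` and `tsv` (the actual file names depend on the download), `missing_from_metadata(fa, tsv)` would list every name that prepareref is likely to reject for this reason.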

For vfdb I think I have a similar error: ariba.reference_data.Error: Duplicate name "stx2B.VFG000838(gb|WP_000738068).Escherichia_coli_O157:H7_str._EDL933" found in file /data/out.vfdb_full.fa. Cannot continue

Can you help me solve it?

Thanks. Regards.

JFsanchezherrero commented 2 years ago

Hi there, I ran into the same issue and I think I found a solution, at least for the VFDB error.

See the issue I opened in BacterialTyper: https://github.com/HCGB-IGTP/BacterialTyper/issues/19

Thanks. Regards.

evezeyl commented 2 years ago

When I get vfdb_full, the download contains duplicated sequences: same name, same sequence, in both the FASTA (.fa) and the description (.tsv) files. It would be nice if ARIBA could automatically filter out duplicates.
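Until something like that exists, the duplicates could be dropped by hand before running prepareref. The following is a minimal sketch, not ARIBA code: it assumes a duplicate is a FASTA record whose full header line and sequence both match an earlier record, and a TSV line that is identical to an earlier line. The function names are hypothetical.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs; the name is the full header after '>'."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def dedupe_fasta(lines):
    """Keep the first record for each name, preserving file order.

    Raises ValueError if one name appears with two different sequences,
    because silently dropping one of them would hide a real conflict.
    """
    seen = {}
    for name, seq in read_fasta(lines):
        if name in seen:
            if seen[name] != seq:
                raise ValueError(f"Name {name!r} has two different sequences")
            continue
        seen[name] = seq
    return list(seen.items())


def dedupe_tsv(lines):
    """Keep the first copy of each non-empty line, preserving file order."""
    seen = set()
    kept = []
    for line in lines:
        row = line.rstrip("\n")
        if row and row not in seen:
            seen.add(row)
            kept.append(row)
    return kept
```

The filtered records and rows would then be written back to new .fa and .tsv files and passed to prepareref in place of the downloaded ones.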