Problem:
Top protein or nucleotide hit lists contain duplicate entries.
Cause:
When top hits from the BLASTp job are processed, entries are counted separately whenever they have a unique accession or a different TaxID. For organisms with a representative genome in NCBI's RefSeq collection, this produces entries with identical TaxIDs but unique accessions. Some organisms with many representative genomes in the database have been assigned multiple TaxIDs, each with a unique accession. Both cases produce what appear to be duplicates in the top hits list. The user should verify that such entries do in fact represent the same organism. The number of top hits displayed in the output list can be adjusted when running the relatedness tool.
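As a rough aid for that verification step, the sketch below groups hits by organism name and labels each apparent duplicate with one of the two cases described above. It is only an illustration: the record layout (`organism`, `taxid`, `accession`), the placeholder values, and the function name `find_apparent_duplicates` are all assumptions, not the tool's actual output format or code. Matching on organism name is a heuristic, so flagged entries still need to be checked by hand.

```python
from collections import defaultdict

# Hypothetical hit records. Field names and values are illustrative
# placeholders, not real accessions, TaxIDs, or output from the tool.
hits = [
    {"organism": "Organism A", "taxid": 1001, "accession": "ACC_0001"},
    {"organism": "Organism A", "taxid": 1001, "accession": "ACC_0002"},
    {"organism": "Organism B", "taxid": 2001, "accession": "ACC_0003"},
    {"organism": "Organism B", "taxid": 2002, "accession": "ACC_0004"},
    {"organism": "Organism C", "taxid": 3001, "accession": "ACC_0005"},
]


def find_apparent_duplicates(hits):
    """Group hits by organism name and report names that occur more than once."""
    groups = defaultdict(list)
    for hit in hits:
        groups[hit["organism"]].append(hit)

    report = {}
    for organism, entries in groups.items():
        if len(entries) < 2:
            continue  # a single entry cannot look like a duplicate
        taxids = sorted({entry["taxid"] for entry in entries})
        report[organism] = {
            "accessions": [entry["accession"] for entry in entries],
            "taxids": taxids,
            # One shared TaxID: same TaxID with unique accessions.
            # Several TaxIDs: the organism was assigned multiple TaxIDs.
            "case": (
                "same TaxID, unique accessions"
                if len(taxids) == 1
                else "multiple TaxIDs"
            ),
        }
    return report


duplicates = find_apparent_duplicates(hits)
for organism, info in duplicates.items():
    print(organism, "|", info["case"], "|", ", ".join(info["accessions"]))
```

With the placeholder data, "Organism A" falls under the first case and "Organism B" under the second, while "Organism C" appears only once and is not reported.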