idrblab / AnnoPRO

Feature map and function annotation of Proteins
MIT License

Attributes in diamond_scores.txt #21

Open Sourdoc opened 2 months ago

Sourdoc commented 2 months ago

Hi, I have installed the AnnoPRO model and used it to predict the function of an amino acid sequence. It generated many files, including diamond_scores.txt. This file does not seem to have a header, so I was wondering what each column indicates. I understand that the first column is the name of the sequence I provided and that the second column is probably a UniProt ID. But what about the rest?

Please have a look at the first five rows of the diamond_scores.txt file:

Secreted_Seq1 P32261 28.9 360 244 5 62 409 103 462 2.88e-34 134
Secreted_Seq1 Q60854 30.9 363 229 9 63 409 22 378 7.08e-34 131
Secreted_Seq1 P80229 32.3 372 227 10 56 409 14 378 3.59e-33 129
Secreted_Seq1 O08800 30.7 374 240 8 49 409 7 374 6.39e-33 128
Secreted_Seq1 Q9S7T8 32.2 369 215 15 67 409 30 389 8.55e-33 128

What do columns 3 to 12 indicate? What are these numbers, and how should I interpret them?

Many thanks for your time and I look forward to hearing from you.

Best, Sourav
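
For reference, a headerless 12-column layout like the rows above matches the default tabular output of DIAMOND (BLAST-style output format 6), whose default fields are qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue and bitscore. Whether AnnoPRO writes diamond_scores.txt with exactly these default fields is an assumption here, not something confirmed in this thread, although the values in the sample rows are consistent with it. Under that assumption, a minimal sketch for labelling one row could look like this (`COLUMNS` and `parse_hit` are hypothetical names used only for illustration):

```python
# Default field names of DIAMOND/BLAST tabular output (format 6).
# ASSUMPTION: diamond_scores.txt uses these defaults, in this order.
COLUMNS = [
    "qseqid",    # query sequence name
    "sseqid",    # subject (database hit) identifier
    "pident",    # percentage of identical positions in the alignment
    "length",    # alignment length
    "mismatch",  # number of mismatches
    "gapopen",   # number of gap openings
    "qstart",    # alignment start in the query
    "qend",      # alignment end in the query
    "sstart",    # alignment start in the subject
    "send",      # alignment end in the subject
    "evalue",    # expect value (lower means more significant)
    "bitscore",  # bit score (higher means a better alignment)
]

FLOAT_FIELDS = {"pident", "evalue", "bitscore"}
INT_FIELDS = {"length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send"}


def parse_hit(line):
    """Split one whitespace-separated row and label its 12 fields."""
    fields = line.split()
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    hit = dict(zip(COLUMNS, fields))
    for name in FLOAT_FIELDS:
        hit[name] = float(hit[name])
    for name in INT_FIELDS:
        hit[name] = int(hit[name])
    return hit


# First row from the question above.
sample = "Secreted_Seq1 P32261 28.9 360 244 5 62 409 103 462 2.88e-34 134"
hit = parse_hit(sample)
print(hit["sseqid"], hit["pident"], hit["evalue"], hit["bitscore"])
```

Read this way, the first row would say that Secreted_Seq1 aligns to P32261 over 360 positions at 28.9% identity, covering query positions 62 to 409 and subject positions 103 to 462, with an e-value of 2.88e-34 and a bit score of 134.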