nigyta / dfast_core

DDBJ Fast Annotation and Submission Tool

information about some fields in the "attribute" column (GFF) #9

Closed matrs closed 5 years ago

matrs commented 5 years ago

Hi all, first, thanks for your software. The installation process was very straightforward and so far everything has worked as expected. I was wondering about the meaning of some fields in the attribute column of the GFF file, specifically under the note element. For example, an entry reads:

note=WP_015064119.1 hypothetical protein (Bordetella bronchiseptica 253) [pid:43.3%25, q_cov:99.7%25, s_cov:98.4%25, Eval:1.1e-60]

What are the meanings of WP_015064119.1 and [pid:43.3%25, q_cov:99.7%25, s_cov:98.4%25, Eval:1.1e-60]? I couldn't find any information about these on the internet.

As a suggestion, it would be very useful for all users if all the fields in the attribute column are explained in some way (at least briefly) in the help/manual file.

Regards

nigyta commented 5 years ago

Dear matrs,

Thank you for using DFAST.

note=WP_015064119.1 hypothetical protein (Bordetella bronchiseptica 253) [pid:43.3%, q_cov:99.7%, s_cov:98.4%, Eval:1.1e-60]

This shows the result of an alignment to the reference sequence. pid and Eval represent percentage identity and E-value. q_cov and s_cov represent the coverage of the query and subject (target) sequences, i.e. the alignment length as a percentage of the query length and of the subject length, respectively.

When either q_cov or s_cov is below 70%, the alignment is marked as 'partial hit'. When internal stop codons or frameshifts are found, they are also annotated in the note attribute.

WP_015064119.1 is the accession number of the reference sequence. You can look it up on the NCBI website: https://www.ncbi.nlm.nih.gov/protein/WP_015064119.1 Accessions starting with WP_ come from the NCBI RefSeq database, and some other reference sequences come from UniProt.
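For anyone who wants to pull these values out of the GFF programmatically, here is a minimal sketch in Python. It is not part of DFAST: the function name `parse_note`, the regular expression, and the `partial_hit` key are assumptions made for illustration, based only on the example note shown in this thread. It assumes the input is a single note value in exactly that layout, and it decodes the `%25` escapes that appear in the GFF file.

```python
import re
from urllib.parse import unquote

# Pattern inferred from the single example note in this thread (an assumption,
# not an official DFAST format specification).
NOTE_PATTERN = re.compile(
    r"^(?P<accession>\S+)\s+(?P<description>.*?)\s+"
    r"\[pid:(?P<pid>[\d.]+)%, q_cov:(?P<q_cov>[\d.]+)%, "
    r"s_cov:(?P<s_cov>[\d.]+)%, Eval:(?P<evalue>[^\]]+)\]$"
)


def parse_note(raw_note, cov_threshold=70.0):
    """Parse one note value; return a dict, or None if the layout differs."""
    note = unquote(raw_note)  # GFF escapes "%" as "%25"
    match = NOTE_PATTERN.match(note)
    if match is None:
        return None
    hit = {
        "accession": match["accession"],
        "description": match["description"],
        "pid": float(match["pid"]),        # percentage identity
        "q_cov": float(match["q_cov"]),    # query coverage (%)
        "s_cov": float(match["s_cov"]),    # subject coverage (%)
        "evalue": float(match["evalue"]),  # E-value
    }
    # Re-applies the rule described above: either coverage below 70%.
    hit["partial_hit"] = (
        hit["q_cov"] < cov_threshold or hit["s_cov"] < cov_threshold
    )
    return hit


example = (
    "WP_015064119.1 hypothetical protein (Bordetella bronchiseptica 253) "
    "[pid:43.3%25, q_cov:99.7%25, s_cov:98.4%25, Eval:1.1e-60]"
)
print(parse_note(example))
```

Notes that carry extra annotations (for example internal stop codons or frameshifts) may not match this pattern, in which case the function returns None rather than guessing.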

Thank you for your suggestion. I'm gonna make a FAQ section to explain these attributes.

Yasuhiro

matrs commented 5 years ago

Thank you very much for your quick response nigyta.

nigyta commented 5 years ago

The FAQ was added in commit 2d58d74f9b8cae7950a2935387d9fbb44afc34b9.