Dear matrs,
Thank you for using DFAST.
note=WP_015064119.1 hypothetical protein (Bordetella bronchiseptica 253) [pid:43.3%, q_cov:99.7%, s_cov:98.4%, Eval:1.1e-60]
This shows the result of an alignment to the reference sequence.
pid and Eval represent percentage identity and E-value. q_cov and s_cov represent the coverage of the query and subject (target) sequences, i.e. the alignment length as a percentage of the query and subject sequence lengths, respectively.
When either q_cov or s_cov is below 70%, the alignment is marked as 'partial hit'.
When internal stop codons or frameshifts are found, they are also annotated in the note attribute.
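As an illustration of the fields described above, here is a minimal Python sketch that splits a note value of the form shown above into its parts and applies the 70% coverage rule. The regular expression, the function name `parse_note`, and the `min_cov` parameter are assumptions made for this example; this is not DFAST's own code, and notes with extra annotations (e.g. internal stop codons or frameshifts) may not match this pattern.

```python
import re

# Hypothetical pattern for illustration only; it matches the note layout
# shown in this thread: accession, description, then the bracketed metrics.
NOTE_PATTERN = re.compile(
    r"^(?P<accession>\S+)\s+(?P<description>.*?)\s+"
    r"\[pid:(?P<pid>[\d.]+)%, q_cov:(?P<q_cov>[\d.]+)%, "
    r"s_cov:(?P<s_cov>[\d.]+)%, Eval:(?P<evalue>[^\]]+)\]$"
)


def parse_note(note, min_cov=70.0):
    """Return the note fields as a dict, or None if the layout differs."""
    match = NOTE_PATTERN.match(note.strip())
    if match is None:
        return None
    fields = match.groupdict()
    result = {
        "accession": fields["accession"],
        "description": fields["description"],
        "pid": float(fields["pid"]),        # percentage identity
        "q_cov": float(fields["q_cov"]),    # coverage of the query
        "s_cov": float(fields["s_cov"]),    # coverage of the subject
        "evalue": float(fields["evalue"]),  # E-value
    }
    # Mirrors the rule described above: either coverage below 70%.
    result["partial_hit"] = (
        result["q_cov"] < min_cov or result["s_cov"] < min_cov
    )
    return result


example = (
    "WP_015064119.1 hypothetical protein (Bordetella bronchiseptica 253) "
    "[pid:43.3%, q_cov:99.7%, s_cov:98.4%, Eval:1.1e-60]"
)
parsed = parse_note(example)
```

For the example note, both coverages are well above 70%, so `parsed["partial_hit"]` is false.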
WP_015064119.1 is the accession number of the reference sequence.
You can look it up on the NCBI website: https://www.ncbi.nlm.nih.gov/protein/WP_015064119.1
Accessions starting with WP_ come from the NCBI RefSeq database; some others come from UniProt.
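If you want to look up many accessions at once, a small helper can build the NCBI link from the accession. This is only a sketch: the function name `ncbi_protein_url` is made up for this example, the URL pattern is simply the one from the link above, and accessions without the WP_ prefix (which may come from UniProt) are deliberately left unhandled.

```python
def ncbi_protein_url(accession):
    """Return the NCBI protein page URL for a WP_ accession, else None."""
    # Only the WP_ prefix is covered here, following the explanation above.
    if accession.startswith("WP_"):
        return "https://www.ncbi.nlm.nih.gov/protein/" + accession
    # Other accessions may come from another database; not handled here.
    return None


url = ncbi_protein_url("WP_015064119.1")
```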
Thank you for your suggestion. I'm going to add a FAQ section to explain these attributes.
Yasuhiro
Thank you very much for your quick response, nigyta.
FAQ was added. 2d58d74f9b8cae7950a2935387d9fbb44afc34b9
Hi all, first, thanks for your software. The installation process was very straightforward and so far everything has worked as expected. I was wondering about the meaning of some fields in the attribute column of the gff file, specifically under the note element. For example, an entry reads:
note=WP_015064119.1 hypothetical protein (Bordetella bronchiseptica 253) [pid:43.3%25, q_cov:99.7%25, s_cov:98.4%25, Eval:1.1e-60]
What are the meanings of WP_015064119.1 and [pid:43.3%25, q_cov:99.7%25, s_cov:98.4%25, Eval:1.1e-60]? I couldn't find any information about these on the internet.
As a suggestion, it would be very useful for all users if all the fields in the attribute column were explained in some way (at least briefly) in the help/manual file.
Regards
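A side note on the %25 in the entry quoted from the gff file: in GFF3, reserved characters in the attribute column are percent-encoded, and a literal % is written as %25, so pid:43.3%25 reads as pid:43.3%. The sketch below decodes an attribute column with Python's standard `urllib.parse.unquote`. The function name `parse_attributes` and the `ID` value in the example are hypothetical, and it assumes the file follows the GFF3 escaping convention and that no attribute value contains an unescaped semicolon.

```python
from urllib.parse import unquote


def parse_attributes(column9):
    """Split a GFF3 attribute column into a dict of decoded key/value pairs."""
    attrs = {}
    # Split on the separators first, then decode, so that an encoded
    # ';' or '=' inside a value is not mistaken for a separator.
    for item in column9.strip().split(";"):
        if not item:
            continue
        key, _, value = item.partition("=")
        attrs[unquote(key)] = unquote(value)
    return attrs


# The ID below is a made-up placeholder; the note is the one from this thread.
column9 = (
    "ID=example_feature_1;"
    "note=WP_015064119.1 hypothetical protein (Bordetella bronchiseptica 253) "
    "[pid:43.3%25, q_cov:99.7%25, s_cov:98.4%25, Eval:1.1e-60]"
)
attrs = parse_attributes(column9)
```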