Open grassking100 opened 4 years ago
I agree, it does look inconsistent and it's likely buggy. There are a few confusing issues here. Looking back at the code I wrote for those options, I noticed that:
- The `-e` option only affects exon level stats; the program usage description (`-h`) actually suggests that ("exon accuracy"), but the web manual incorrectly states that transcript level accuracy is also affected by `-e`, which is actually not implemented in the code (!)
- The `--strict-match` option does not affect the accuracy assessment (the `.stats` output) at all (!). That option was only added for the purpose of transcript classification data: when used, it introduces the classification code `~` and changes the meaning of class code `=` to "exact coordinate match for every exon" (not just every intron, as `~` means when this option is used). So theoretically this option should at least have caused the class code `~` to be shown instead of `=` in the `.tracking` and the `.annotated.gtf` output files, but it does not (due to a bug in my code, I see now).

So I'll have to fix the `--strict-match` option, and I guess I should also make it affect the transcript level stats, as you expected. As for `-e`, I think I should make it affect transcript level accuracy as well, as documented in the manual.
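To make the intended distinction between `=` and `~` concrete, here is a minimal Python sketch of the classification as described above. It is not gffcompare's actual code: the function names, the exon representation, and the rule that any overlap counts as a loose match for single-exon transcripts are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple

# An exon is (start, end), 1-based inclusive; exon lists are sorted by start.
Exon = Tuple[int, int]


def intron_chain(exons: List[Exon]) -> List[Tuple[int, int]]:
    """Return the introns implied by each pair of consecutive exons."""
    return [(exons[i][1] + 1, exons[i + 1][0] - 1) for i in range(len(exons) - 1)]


def class_code(query: List[Exon], ref: List[Exon], strict_match: bool) -> Optional[str]:
    """Hypothetical classifier: '=' / '~' for a match, None otherwise."""
    if len(query) != len(ref):
        return None
    if len(ref) == 1:
        # Assumption: for single-exon transcripts, any overlap is a loose match.
        overlaps = query[0][0] <= ref[0][1] and ref[0][0] <= query[0][1]
        if not overlaps:
            return None
    elif intron_chain(query) != intron_chain(ref):
        return None
    if not strict_match:
        # Without --strict-match, a loose match is already reported as '='.
        return "="
    # With --strict-match, '=' is reserved for identical exon coordinates.
    return "=" if query == ref else "~"


# Hypothetical coordinates: the query differs from the reference only at its 5' end.
ref_multi = [(100, 200), (300, 400)]
query_multi = [(105, 200), (300, 400)]
print(class_code(query_multi, ref_multi, strict_match=True))
print(class_code(query_multi, ref_multi, strict_match=False))
```

Under these assumptions, a transcript that differs from the reference only at its outer boundaries would be reported as `~` with `--strict-match` and as `=` without it, which is the behavior the option was meant to produce.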
Thank you for reporting these bugs/inconsistencies while providing very nice and clear example files.
Hello, I was recently running GffCompare and found some confusing results. The query file has one single-exon transcript that differs slightly from the reference at its boundary, and one multi-exon transcript that also differs slightly from the reference at its boundary. When I ran GffCompare with "--strict-match -e 0 -d 0", I expected the transcript-level accuracy for both to be 0% because of the boundary errors, but the results show 100%. What could be the reason for this?
Thank you grassking100
gffcmp.stats
subanswer.gff3
subpredict.gff3