Open bebatut opened 7 years ago
I'd bet an expert review will be mandatory since there is no authoritative pyrrolysine-containing protein database. So we need some positive and negative controls, which could be provided by experts.
One strategy for positive control could be:
tblastn
For negative controls, expertly selected bacterial genomes should be sufficient.
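To make the idea of positive and negative controls a bit more concrete, here is a minimal sketch of how they could be scored once the expert-provided sets exist. The function name `score_controls` and its three arguments are hypothetical and not part of the tool; it only assumes that predictions and known Pyl proteins can be reduced to lists of identifiers.

```python
def score_controls(predicted_in_positives, known_positives, predicted_in_negatives):
    """Summarise a run against expert-provided controls (hypothetical helper).

    predicted_in_positives: IDs predicted as Pyl proteins in the positive-control genomes
    known_positives: IDs of the Pyl proteins the experts expect to be found there
    predicted_in_negatives: IDs predicted as Pyl proteins in the negative-control genomes
    """
    known = set(known_positives)
    found = set(predicted_in_positives) & known
    # Fraction of the expected Pyl proteins that were recovered
    recall = len(found) / len(known) if known else 0.0
    return {
        "recall": recall,
        "missed": sorted(known - found),
        # Any prediction in a negative-control genome counts as a false positive
        "false_positives": sorted(set(predicted_in_negatives)),
    }


report = score_controls(["p1", "p2", "x"], ["p1", "p2", "p3", "p4"], ["n1"])
print(report)
```

A check like this could run without manual intervention once the control sets are fixed, but which identifiers go into each set would still be the experts' call.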
It could be a good idea. It would be even better if we could automate all the tasks, both to limit manual intervention and to let users test the tool themselves. Do you think we can do that?
I ran some validation tests (not really successful, though).
I built a tiny protein database containing 29 known Pyl proteins (FASTA). Here is the DB.
results/test/conserved_potential_pyl_sequences.fasta
sha512sums differed. This means we cannot validate a run by comparing the expected output file with the computed one byte for byte. Instead, we have to compare the content of the files.
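Checking the content rather than the checksum could look roughly like the sketch below. It assumes the output is plain FASTA and that two runs count as equal when they contain the same headers with the same sequences, whatever the record order or line wrapping. The helper names `read_fasta` and `same_fasta_content` are made up for illustration.

```python
from pathlib import Path


def read_fasta(path):
    """Return a dict mapping each FASTA header to its sequence (line breaks removed)."""
    records = {}
    header = None
    chunks = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: store the previous one first
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            chunks.append(line)
    # Store the last record of the file
    if header is not None:
        records[header] = "".join(chunks)
    return records


def same_fasta_content(expected_path, computed_path):
    """True if both files hold the same records, regardless of order or line wrapping."""
    return read_fasta(expected_path) == read_fasta(computed_path)
```

If the checksums only differ because of record order or line wrapping, this comparison would still pass. If the sequences themselves change between runs, it would fail, which is probably what we want a test to report.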
Hi,
We need to check the results of the prediction.
@keuv-grvl, @ylana Any idea how to do that?
Some ideas:
data
to the PYL proteins registered on the NCBI
We need to do that automatically (same thing to extract the genomes in data)
Bérénice