enasequence / sequencetools

Webin sequence validation API.
Apache License 2.0

Problem with the RL line #18

Closed Juke34 closed 6 years ago

Juke34 commented 7 years ago

I'm using the EMBL flat file validator embl-api-validator-1.1.156.jar to check the sanity of an EMBL file, and I have run into a problem with the RL line. None of the RL line examples described in the EMBL User Manual (here: ftp://ftp.ebi.ac.uk/pub/databases/embl/doc/usrman.txt) pass validation except this one:

 RL   Submitted (19-NOV-1990) to the INSDC.
 RL   M.A. Hughes, UNIVERSITY OF NEWCASTLE UPON TYNE, MEDICAL SCHOOL, NEW
 RL   CASTLE UPON TYNE, NE2  4HH, UK

I have seen that the developers of the gff3toembl tool have also used this trick, writing "RL Submitted (13-Jul-2016) to the INSDC." to pass the validator check.

In the past, using "RL Unpublished." as specified in the documentation was sufficient for the validator.

Is it a not-yet-documented requirement that the line " RL Submitted (XX-FEB-XXXX) to the INSDC." is now mandatory, or does the validator need to be improved/fixed so that it follows the documentation more closely?
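
To make the distinction concrete, here is a minimal Python sketch that tells a submission-style RL line (the only form that passed) apart from the other forms such as "RL Unpublished.". The pattern is only an assumption modelled on the example line quoted above; it is not taken from the validator's source, and the names `SUBMITTED_RL`, `MONTHS` and `is_submission_rl` are hypothetical.

```python
import re

# Three-letter month abbreviations as used in the quoted example (19-NOV-1990).
MONTHS = {"JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"}

# Assumed shape of the first line of a submission reference:
#   RL   Submitted (DD-MON-YYYY) to the INSDC.
# This mirrors the example in this thread, not the validator's real rule.
SUBMITTED_RL = re.compile(
    r"^RL\s+Submitted \((\d{2})-([A-Za-z]{3})-(\d{4})\) to the INSDC\.$"
)

def is_submission_rl(line: str) -> bool:
    """Return True if the line looks like a submission-style RL line."""
    match = SUBMITTED_RL.match(line.strip())
    if match is None:
        return False
    day, month, _year = match.groups()
    # Loose sanity check only: day in 1..31 and a known month abbreviation.
    return 1 <= int(day) <= 31 and month.upper() in MONTHS

if __name__ == "__main__":
    for candidate in (
        "RL   Submitted (19-NOV-1990) to the INSDC.",
        "RL   Unpublished.",
    ):
        print(candidate, "->", is_submission_rl(candidate))
```

A check like this only shows which lines match the submission form; whether the validator should require that form is the open question here.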

raskoleinonen commented 7 years ago

Thank you. We will test examples from ftp://ftp.ebi.ac.uk/pub/databases/embl/doc/usrman.txt against the parser. Also, we will investigate if we can remove the requirement to have any references (R* lines) present in the flat files.

kethireddy commented 7 years ago

Tested all the RL line formats given in ftp://ftp.ebi.ac.uk/pub/databases/embl/doc/usrman.txt and the validator supports all of them.

The mandatory submitter reference check is what causes this issue (ERROR: Submitter references are mandatory in EMBL-BANK entries.)

The reference check will be excluded from the next version of the validator, i.e. 1.1.174.