enasequence / sequencetools

Webin sequence validation API.
Apache License 2.0

Problem with RA and RG lines #17

Closed Juke34 closed 7 years ago

Juke34 commented 7 years ago

I'm using the EMBL flat file validator embl-api-validator-1.1.156.jar to check the validity of an EMBL file, and I got this error message:

ERROR: Reference author (RA) and Reference consortium (RG) missing for RN [1] (CitationExistsCheck_2).

If I fill in either the RA line or the RG line, the error disappears.

However, the EMBL User Manual (ftp://ftp.ebi.ac.uk/pub/databases/embl/doc/usrman.txt) says:

RG - reference group (>=0 per entry)
RA - reference author(s) (>=0 per entry)

So these fields shouldn't be mandatory...

Later the same document says:

3.4.10 The Reference (RN, RC, RP, RX, RG, RA, RT, RL) Lines

These lines comprise the literature citations within the database. The citations provide access to the papers from which the data has been abstracted. The reference lines for a given citation occur in a block, and are always in the order RN, RC, RP, RX, RG, RA, RT, RL. Within each such reference block the RN line occurs once, the RC, RP and RX lines occur zero or more times, and the RA, RT, RL lines each occur one or more times. If several references are given, there will be a reference block for each.

Contrary to the first statement, this paragraph says that RA has to occur one or more times (>=1), and it says nothing about RG. Which one is correct? The validator currently checks (RG >= 1 or RA >= 1).

The EMBL User Manual clearly needs some editing. Could you report the problem to its maintainers?

Juke34 commented 7 years ago

So the validator is right! It was not obvious to me at first, but the documentation means the same thing: having RG or RA is mandatory.
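
For anyone who lands here with the same error, the rule can be illustrated with a small Python sketch. This is not the validator's actual (Java) implementation, only an approximation of the check as described in this thread: every reference block opened by an RN line needs at least one non-empty RA or RG line. The function name and the sample entry are hypothetical and made up for illustration.

```python
import re


def references_missing_author_and_group(flatfile_text):
    """Return the RN numbers of reference blocks that have neither an RA nor an RG line.

    Hypothetical helper: it mimics the rule discussed in this thread
    (RA >= 1 or RG >= 1 per reference block), not the real validator code.
    """
    missing = []
    current_rn = None          # number of the reference block being scanned
    has_ra_or_rg = False       # True once a non-empty RA or RG line is seen

    for line in flatfile_text.splitlines():
        code = line[:2]
        if code == "RN":
            # A new block starts: close the previous one first.
            if current_rn is not None and not has_ra_or_rg:
                missing.append(current_rn)
            match = re.search(r"\[(\d+)\]", line)
            current_rn = int(match.group(1)) if match else -1
            has_ra_or_rg = False
        elif code in ("RA", "RG") and line[2:].strip():
            has_ra_or_rg = True

    # Close the last block of the entry.
    if current_rn is not None and not has_ra_or_rg:
        missing.append(current_rn)
    return missing


# Made-up reference lines: block [1] has no RA/RG, block [2] has an RG line.
example = """RN   [1]
RP   1-100
RT   ;
RL   Unpublished.
RN   [2]
RG   Example Consortium
RT   ;
RL   Unpublished.
"""

print(references_missing_author_and_group(example))
```

With the sample above, only reference [1] is reported, which matches the situation in the original error message: adding either an RA or an RG line to that block makes it pass.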