openaire / guidelines-cris-managers

OpenAIRE Guidelines for CRIS Managers based on CERIF-XML
https://openaire-guidelines-for-cris-managers.readthedocs.io/

Usage of dates in the Patent entity is unclear #88

Closed abollini closed 2 years ago

abollini commented 4 years ago

The Patent entity has two date elements in the XML, ApprovalDate and RegistrationDate. Judging by the example at https://github.com/openaire/guidelines-cris-managers/blob/v1.1/samples/openaire_cerif_xml_example_patents.xml#L23 their names look confusing to me.

The ApprovalDate seems to refer to the Application Date in the EPO document, which many also call the Priority Date: https://worldwide.espacenet.com/patent/search/family/050033698/publication/BR112016010203A2?q=BR20161110203

The RegistrationDate is instead the Publication Date. Am I right? If so, renaming these elements in the XML would probably be the ideal solution, but it would not be backward compatible. In any case, it would be useful to add more information about these elements to the guidelines and as a comment in the example.

ACz-UniBi commented 4 years ago

Dear @abollini, thank you for pointing this out. The example can be a bit confusing here.

The ApprovalDate is more likely to be the publication date, with the meaning of "Date on which the publication was made available to the public", and the RegistrationDate rather refers to the transmission/submission of the document to the authority in charge, with the meaning of "Date on which the application was physically received at the Patent Authority".

I'm not an expert on patent metadata, but this would be my suggestion from the EPO metadata description.

For more clarification, we should update the example.

olli-gold commented 4 years ago

While the description of RegistrationDate sounds reasonable (although it could be SubmissionDate as well, which might be clearer), in my opinion ApprovalDate is not a very good name for an element that describes the date of publication, and I agree with @abollini that it's quite confusing. The ApprovalDate should rather be the date the patent became effective, which is different from the date when it was first published. In this context I am missing an element in OpenAIRE that really describes the date of approval of the patent. It should be optional, but it should at least exist (and if we used the existing ApprovalDate field for that, we would be missing a date of publication, which is obviously mandatory for patents)...

So I would propose not only to update the example, but also to think about the labels/names of the elements and the introduction of a new field/element for the date the patent became effective.

ACz-UniBi commented 4 years ago

@olli-gold you're right. I should have written: The ApprovalDate is more likely to be the granted publication date with the meaning of "Date on which the granted publication was made available to the public"

But I noticed at https://eurocris.org/cerif/feature-tour/cerif-16 that the elements

cfRegistrationDate and cfApprovalDate from cfResultPatent (put the dates in the links to the patent office)

are marked as "Deprecated Attributes -- to be removed in a future release:"
For further consideration, I would suggest looking at the CERIF model as well. Maybe there will be some further changes.

jdvorak001 commented 4 years ago

Yes, the ApprovalDate is the date when the patent was granted by the patent office.

RegistrationDate could perhaps be more aptly called FilingDate, the date when the patent application was submitted to the patent office. So it should precede the ApprovalDate. However, renaming a field is a difficult operation (it breaks backward compatibility), not to be done in a micro update of the spec. So we suggest adding a description to the specification and putting the alternative name there.
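To make the ordering constraint concrete, here is a minimal sketch (not part of the guidelines) of how a CRIS could check that RegistrationDate does not come after ApprovalDate. It assumes the dates are plain ISO `YYYY-MM-DD` values and that the profile namespace is `https://www.openaire.eu/cerif-profile/1.1/`; the record id, the dates, and the helper names `patent_dates` and `dates_in_order` are made up for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Assumed namespace of the OpenAIRE CERIF profile v1.1; adjust if yours differs.
NS = {"oa": "https://www.openaire.eu/cerif-profile/1.1/"}

# Hypothetical record: the id and both dates are invented for this sketch.
SAMPLE = """
<Patent xmlns="https://www.openaire.eu/cerif-profile/1.1/" id="patents/example-1">
  <RegistrationDate>2013-11-05</RegistrationDate>
  <ApprovalDate>2016-08-16</ApprovalDate>
</Patent>
"""


def patent_dates(xml_text):
    """Return (RegistrationDate, ApprovalDate) as dates, or None where absent."""
    root = ET.fromstring(xml_text)

    def read(tag):
        el = root.find(f"oa:{tag}", NS)
        if el is None or not el.text:
            return None
        return date.fromisoformat(el.text.strip())

    return read("RegistrationDate"), read("ApprovalDate")


def dates_in_order(xml_text):
    """True if filing (RegistrationDate) is not later than grant (ApprovalDate)."""
    filed, granted = patent_dates(xml_text)
    if filed is None or granted is None:
        return True  # nothing to compare when either date is missing
    return filed <= granted
```

A record whose two dates were swapped, as suspected for the sample file, would make `dates_in_order` return `False`.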

Yes, these attributes are marked deprecated in the CERIF model. The idea is to document the lifecycle of the patent using StartDates on the links between the patent and a semantic term describing the state of the patent (in the CERIF semantic layer). So e.g. the publication date of a patent would be the StartDate from the link between the patent and the term "Published" from a (tentative) "PatentStatuses" vocabulary.
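A rough sketch of that idea, in Python rather than CERIF-XML, may help. It only illustrates the data shape: the `StatusLink` class, the `status_date` helper, the patent id, the dates, and the "Filed" and "Granted" terms are all hypothetical, and "PatentStatuses" is the tentative vocabulary name mentioned above, not an existing one.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass(frozen=True)
class StatusLink:
    """A link between a patent and a status term; StartDate marks when the state began."""
    patent_id: str
    scheme: str       # vocabulary the term belongs to
    term: str         # e.g. "Published"
    start_date: date


def status_date(links: Iterable[StatusLink], patent_id: str, term: str,
                scheme: str = "PatentStatuses") -> Optional[date]:
    """Return the StartDate of the first link matching patent, scheme and term."""
    for link in links:
        if (link.patent_id, link.scheme, link.term) == (patent_id, scheme, term):
            return link.start_date
    return None


# Hypothetical lifecycle of one patent; all values are invented.
links = [
    StatusLink("patents/example-1", "PatentStatuses", "Filed", date(2013, 11, 5)),
    StatusLink("patents/example-1", "PatentStatuses", "Published", date(2015, 5, 14)),
    StatusLink("patents/example-1", "PatentStatuses", "Granted", date(2016, 8, 16)),
]

publication_date = status_date(links, "patents/example-1", "Published")
```

With this shape, adding a new kind of date means adding a term to the vocabulary instead of a new element to the Patent entity.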

The fact that PublicationDate would probably be the most widespread date you'd have on patents escalates the priority of the change.

abollini commented 4 years ago

Hi @jdvorak001, if I understood correctly, our example has the dates inverted. I have created a PR to fix the example and include the description, as per the discussion above.

olli-gold commented 3 years ago

Thank you for the PR, @abollini! But in my opinion it does not reflect the discussion completely: following the explanation of @jdvorak001, the ApprovalDate is not the date of publication but the date of approval, and the date of publication is still missing from the specification (which is probably not good, because it's probably the most widespread date on patents, as @jdvorak001 also wrote). As those attributes are deprecated anyway, I guess it won't make sense to add this attribute without further discussion at this point, though...

abollini commented 3 years ago

@olli-gold @jdvorak001 I have updated the linked PR hoping to resume this discussion

abollini commented 2 years ago

@jdvorak001 can you take a look at https://github.com/openaire/guidelines-cris-managers/pull/89? It would be nice if we could unblock this issue.

jdvorak001 commented 2 years ago

Resolved by #89 and #106