ejp-rd-vp / vp-portal-issues

Updating Orphanet Codes in Database for VP-Portal #49

Open ammarbarakat opened 11 months ago

ammarbarakat commented 11 months ago

Describe your problem.

We rely on accurate and up-to-date Orphanet codes in our database to ensure the VP-Portal functions correctly. Currently, we face challenges in keeping our database synchronized with Orphanet's code updates. We need to implement a solution that allows us to automatically update our database when there are changes to Orphanet codes.

Proposed Solutions:

  1. Solution: API for Orphanet Code Updates. We propose the creation of an API that provides us with real-time information about updates to Orphanet codes. This API should be capable of informing us about the following:

    • New Orphanet codes that have been added.
    • Orphanet codes that have been modified or updated.
    • Orphanet codes that have been deleted or deprecated.

     With this information, we can programmatically update our database to reflect the latest Orphanet codes.

  2. Solution: Hook Notification for Orphanet Code Updates. Alternatively, we suggest implementing a hook or notification mechanism that informs us whenever there is an update in the Orphanet codes. This notification should be triggered automatically whenever Orphanet makes changes to its code repository. Upon receiving the notification, we should be able to download the updated Orphanet codes file directly. It is crucial that the provided Orphanet codes file is always up-to-date to ensure the accuracy of our database.

  3. Other Solutions: We are open to exploring alternative solutions that may address this issue. If you have any other ideas or recommendations for keeping our Orphanet codes up-to-date on our side, we will be more than happy to discuss them.
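
As a rough illustration of the three kinds of change listed in Solution 1, the sketch below compares two snapshots of the code list. It assumes each snapshot is a plain mapping from ORPHAcode to label. The names `CodeDiff` and `diff_codes` and the sample codes and labels are hypothetical placeholders and do not correspond to any existing Orphanet API.

```python
from dataclasses import dataclass, field


@dataclass
class CodeDiff:
    """Changes between two snapshots of the code list (hypothetical type)."""
    added: dict = field(default_factory=dict)
    modified: dict = field(default_factory=dict)
    removed: dict = field(default_factory=dict)


def diff_codes(old, new):
    """Compare two {code: label} mappings and report what changed."""
    diff = CodeDiff()
    for code, label in new.items():
        if code not in old:
            diff.added[code] = label
        elif old[code] != label:
            # Keep both labels so the database update can be audited.
            diff.modified[code] = (old[code], label)
    for code, label in old.items():
        if code not in new:
            diff.removed[code] = label
    return diff


# Placeholder data, not real ORPHAcodes.
previous = {"100": "Disease A", "200": "Disease B", "300": "Disease C"}
current = {"100": "Disease A", "200": "Disease B (renamed)", "400": "Disease D"}
changes = diff_codes(previous, current)
```

Codes reported in `removed` would probably need to be flagged rather than deleted on our side, since existing records may still reference them.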

@Orphanet could you help us solve this issue?

HaddadTala commented 11 months ago

@Orphanet it seems that the main orphacode search is done via the flat file. Can you advise which of the solutions above is best in the long term?

Orphanet commented 11 months ago

We currently have different options (for different purposes/stakeholders).

The most usual are:

www.orphadata.com => https://www.orphadata.com/alignments/ (several languages available)

    • Updated: twice a year
    • Format: XML
    • Also available through GitHub: https://github.com/Orphanet/Orphadata_aggregated

ORDO (Orphanet Rare Disease Ontology) https://www.orphadata.com/ordo/ (several languages)

    • Updated: twice a year
    • Format: OWL
    • Change log available: https://www.orphadata.com/data/ontologies/ordo/last_version/ORDO_releaseNotes_4.3_en.txt
    • Also available through BioPortal https://bioportal.bioontology.org/ontologies/ORDO which also provides CSV, RDF/XML and a Diff (their own conversion)

"Nomenclature pack" https://www.orphadata.com/pack-nomenclature/

    • Usage: patient coding purposes
    • Updated: once a year (was mandatory to ease the yearly implementation in Health IT systems)
    • Format: XML
    • Includes: Master file (Excel file) (only the minimal set of ORPHAcodes, aligned with ICD-10 codes, that should be used for data sharing and statistical purposes at EU level => should not be used for other purposes)

API: https://api.orphacode.org/

    • Updated: once a year, same content as the "pack"
    • https://api.orphacode.org/openapi.json (please note that we have /{lang}/ClinicalEntity/orphacode/{orphacode}/Status, which allows you to know whether the status of a code is "active" or "inactive")

https://api.orphadata.com/

    • Updated: twice a year, same content as www.orphadata.com
    • https://api.orphadata.com/openapi.json
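
For the status endpoint mentioned above, a minimal sketch of a client could look like the following. Only the base URL and the path template come from this thread. The language value `"EN"`, the placeholder code, the function names, and the idea of passing extra request headers are assumptions; the authoritative parameters, any authentication requirement and the response schema are described in the openapi.json.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://api.orphacode.org"


def status_url(orphacode, lang="EN"):
    """Build the URL for /{lang}/ClinicalEntity/orphacode/{orphacode}/Status."""
    return (
        f"{BASE_URL}/{quote(lang)}/ClinicalEntity/orphacode/"
        f"{quote(str(orphacode))}/Status"
    )


def fetch_status(orphacode, lang="EN", headers=None, timeout=10):
    """Call the status endpoint and return the decoded JSON body.

    The shape of the returned JSON is not assumed here; check openapi.json.
    `headers` is a hypothetical hook for whatever the API may require.
    """
    request = urllib.request.Request(
        status_url(orphacode, lang), headers=headers or {}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# "12345" is a placeholder, not a real ORPHAcode.
example_url = status_url(12345)
```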

We probably need further discussion about update policies: 1) frequency/synchronicity of updates? 2) how to manage resources/records that still use deprecated or obsolete codes at source (do we allow searching in the VP by ANY orphacode or only the "active" ones? If any code is allowed, the VP probably doesn't need to care about the status of a given code).

For instance, on the Orphanet website we don't allow searching by deprecated codes, or by "orphacodes" that are no longer considered "rare diseases" because of a new prevalence estimation but were previously part of the knowledge base. Nevertheless, those codes are still in our KB and are never reused for something else.
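
The two search policies discussed here could be sketched as a single switch on the VP side. This is only an illustration under stated assumptions: `filter_searchable`, the `allow_inactive` flag and the sample codes are hypothetical, and the status values simply reuse the "active"/"inactive" wording from the API description.

```python
def filter_searchable(requested, status_by_code, allow_inactive=False):
    """Split requested codes into (accepted, rejected) under a search policy.

    status_by_code maps each known code to "active" or "inactive".
    Codes missing from the mapping are always rejected.
    """
    accepted, rejected = [], []
    for code in requested:
        status = status_by_code.get(code)
        if status == "active" or (allow_inactive and status == "inactive"):
            accepted.append(code)
        else:
            rejected.append(code)
    return accepted, rejected


# Placeholder codes, not real ORPHAcodes.
statuses = {"100": "active", "200": "inactive"}
strict = filter_searchable(["100", "200", "999"], statuses)
lenient = filter_searchable(["100", "200", "999"], statuses, allow_inactive=True)
```

With `allow_inactive=False` the behaviour mirrors the Orphanet website; with `allow_inactive=True` records that still carry older codes at source remain findable.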

Of course, we can agree on a specific "format" or API more suitable for the dedicated usage in the VP. But in that case we need a clear policy regarding updates (management of the resources using deprecated codes, "real-time" or a given "frequency").

ammarbarakat commented 11 months ago

@Orphanet Thanks a lot for the detailed response, Marc! Based on the info you shared, it seems that updating the XML files more often, say every month, makes the most sense. This way, we'll have a super up-to-date database, and it seems like a smoother process for you to handle.
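
If the files are refreshed on a fixed schedule, one possible way to avoid re-importing an unchanged file is to compare a checksum of each download with the previous one. The sketch below is an assumption about how that check might be done on our side, not something Orphanet provides; `file_digest` and `has_changed` are hypothetical names and the byte strings stand in for downloaded XML.

```python
import hashlib


def file_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a downloaded file's bytes."""
    return hashlib.sha256(data).hexdigest()


def has_changed(data: bytes, last_digest=None) -> bool:
    """True if there is no previous digest or the content differs from it."""
    return last_digest is None or file_digest(data) != last_digest


# Stand-in content; a real run would use the bytes of the downloaded XML file.
first_download = b"<codes><code>100</code></codes>"
stored_digest = file_digest(first_download)
same_again = has_changed(first_download, stored_digest)
after_update = has_changed(b"<codes><code>100</code><code>400</code></codes>", stored_digest)
```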