FDA / openfda

openFDA is a research project to provide open APIs, raw data downloads, documentation and examples, and a developer community for an important collection of FDA public datasets.
https://open.fda.gov
Creative Commons Zero v1.0 Universal

Open, and Linked, FDA data #5

Closed kerfors closed 4 years ago

kerfors commented 10 years ago

Excellent to see that openFDA features harmonization on identifiers as annotated strings (http://open.fda.gov/api/reference/). How about HTTP-based URIs as a next step, as part of applying the Linked Data principles? See http://kerfors.blogspot.se/2014/04/openfda.html for links to examples such as Linked AERS and Linked SPL, and to experts in W3C HCLS and Bio2RDF.

seanherron commented 10 years ago

Hi @kerfors, it looks like most of the links on that page are broken. Do you have an example of how this would play out with the API Results?

kerfors commented 10 years ago

Hi Sean, sorry to see that the @data2semantics links on the Linked AERS site are broken. I'll ping @RinkeHoekstra. However, the general idea is to assign an HTTP-based identifier (URI/IRI) to each code. So, besides providing a code as a string, it would be great to also have a URI, preferably one backed by a resolver service (even though some of the licenses make this hard). Health care data standards initiatives such as HL7 FHIR are now moving in this direction: http://hl7.org/implement/standards/fhir/terminologies-systems.html And we are pushing clinical research data standards organisations such as CDISC and MedDRA to provide persistent URIs. I propose that you get in contact with experts in linked data and the semantic web for health care and life sciences (W3C HCLS) such as @micheldumontier, @egonwillighagen, Eric Prud'hommeaux and Charlie Mead.
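
As a rough illustration of how this could play out with an API result, here is a minimal sketch. It assumes an `openfda` annotation block with `rxcui` and `unii` fields; the URI prefixes are placeholders on `example.org` chosen purely for illustration, not identifiers published by FDA or by the terminology owners, and `add_uris` is a hypothetical helper.

```python
# Hypothetical URI prefixes, one per code system. Real prefixes would have to
# come from the terminology owners or from FDA; these are placeholders.
URI_PREFIXES = {
    "rxcui": "http://example.org/rxnorm/",
    "unii": "http://example.org/unii/",
}


def add_uris(openfda: dict) -> dict:
    """Return a copy of an openfda annotation block with a parallel
    `<field>_uri` list for every field that has a known prefix."""
    out = dict(openfda)
    for field, prefix in URI_PREFIXES.items():
        if field in openfda:
            out[field + "_uri"] = [prefix + code for code in openfda[field]]
    return out


# Made-up sample values, not taken from a real API response.
sample = {"rxcui": ["12345"], "unii": ["ABC123XYZ0"]}
annotated = add_uris(sample)
print(annotated["rxcui_uri"])
```

The string codes stay as they are, so existing clients would not break; the URIs are purely additive.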

RinkeHoekstra commented 10 years ago

Hi Kerstin and Sean,

Thanks for letting me know, we're working to get Linked AERS up and running again. Keep in mind that this is not a 'live' service. We would most gladly connect to the OpenFDA API instead.

Rinke

GeekNurse commented 10 years ago

@kerfors @RinkeHoekstra this site may be helpful to you if you're simply wanting to search through the OpenFDA AERS data: ResearchAE.com (http://www.researchae.com/). Let me know if you have any questions.

kerfors commented 10 years ago

Hi @GeekNurse, I'm pointing colleagues to your great user interface to query and visualise data using the brilliant openFDA API. My point to @seanherron is really the additional value you would get from having the identifiers as HTTP-based URIs, as a step towards 5-star Open and Linked Data. Check out http://5stardata.info/ and http://www.w3.org/TR/ld-bp/ cc: @BernHyland

kerfors commented 9 years ago

Great to see the "Enhancement" label on this issue. Any updates?

bewest commented 9 years ago

This may be a crazy tangent, but it's awesome to see FDA talking about LinkedData. In Richard Chapman's discussion of assurance cases, he talks about iterating, inspecting, and reasoning through recursive documents, and I immediately thought about LinkedData, and something like http://worrydream.com/TenBrighterIdeas/ as an editor/UI for these documents.

For my open source project, I've been asked to prepare a gap analysis, so I've started modeling a suite of documents roughly after the quality regulations themselves. http://process-controls.readthedocs.org/en/latest/index.html

As a thought experiment, are tools such as the openfda API an example of the MDDT regulations? Do the quality controls/regulations apply to tools such as these? If quality systems regulations were to apply to a project such as this, and you had to provide, e.g., a class III submission, would you use LinkedData to link audit reports to the quality controls, so that someone could traverse the data all the way to the regulations and back to the practical issue "at hand"? In theory GitHub tracks a lot of usage trails, so reports/templates generated through the API could automate a lot of the typically cumbersome (and expensive) work of preparing appropriate documentation. If class III controls applied to a project like this, would you, e.g., create rst/markdown output from the test runners to render reports? Would they link or be linked to in some special way to or from other reports? With LinkedData, in theory, someone looking at MDRs/MAUDE could track trending issues all the way back to reports on how things were fixed, or handled. Would the regulations themselves also need to be linked data?

As a practical matter, I've chosen sphinx/rst/markdown so I can easily re-theme, re-engrave, re-render, and version documents with as many or as few "links" in them as needed. As an even more practical matter, I would love to figure out a way to cite or link to regulations, and if the project starts submitting MDR reports, how to best interlink between everything.

Lot of questions in there, from philosophical to practical, and perhaps only tangentially related to the thread, sorry for the noise if that's the case.

westurner commented 9 years ago

Could there be a JSONLD @context?

  1. [ ] Create/generate a JSON-LD @context
  2. [ ] Annotate with the relevant @context attributes
    • (EDIT: this would need to be added to the data pipelines)
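
The two steps above could look roughly like the following minimal sketch. The namespace is a placeholder on `example.org` (no such vocabulary is known to be published), the helper names are hypothetical, and the record values are made up.

```python
import json

# Placeholder namespace for illustration only.
NS = "http://example.org/openfda/ns/drug_event#"


def make_context(fields):
    """Step 1: map each field name to a term IRI in the namespace."""
    return {field: NS + field for field in fields}


def annotate(record, context):
    """Step 2: attach the context to a record, leaving the data untouched."""
    return {"@context": context, **record}


record = {"safetyreportid": "0000000", "serious": "1"}  # made-up values
doc = annotate(record, make_context(record))
print(json.dumps(doc, indent=2))
```

In a data pipeline the `@context` would more likely be referenced by URL than embedded in every record; embedding it here just keeps the sketch self-contained.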

Docs:

westurner commented 9 years ago

https://github.com/FDA/openfda/tree/master/schemas

I'm not aware of a general approach for ElasticSearch JSON to/from JSON-LD; but it should be relatively easy to do.

These are often helpful:

westurner commented 9 years ago

I wrote a tool to generate approximate JSON-LD @contexts from (these) ElasticSearch mappings: https://github.com/westurner/elasticsearchjsonld/blob/master/elasticsearchjsonld/elasticsearchjsonld.py

The output JSON-LD @context schemas (.jsonld) are here: https://github.com/westurner/openfda-jsonld-testing/tree/gh-pages/ns

Not sure how to test these

westurner commented 9 years ago

I specified the vocabulary prefixes as http://open.fda.gov/ns/${x}# here: https://github.com/westurner/elasticsearchjsonld/blob/master/scripts/build_openfda_jsonld_contexts.sh
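
For readers who want the general shape of that approach, here is a minimal sketch. It is not the code of the linked tool; it only illustrates walking an Elasticsearch mapping and emitting an approximate @context with the prefix pattern above. The type table is a partial, assumed mapping, the sample mapping is made up, and `context_from_mapping` is a hypothetical name.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

# Partial, assumed mapping from Elasticsearch field types to XSD datatypes.
ES_TO_XSD = {
    "string": "string",
    "long": "long",
    "double": "double",
    "boolean": "boolean",
}


def context_from_mapping(mapping, name):
    """Build an approximate JSON-LD @context from an Elasticsearch mapping.

    Nested field names are flattened into one namespace, so two nested
    fields with the same name would collide; a real tool needs a policy
    for that case.
    """
    ns = f"http://open.fda.gov/ns/{name}#"
    context = {}

    def walk(properties):
        for field, spec in properties.items():
            term = {"@id": ns + field}
            if "properties" in spec:
                # Object field: record the term, then recurse into children.
                context[field] = term
                walk(spec["properties"])
            else:
                xsd_type = ES_TO_XSD.get(spec.get("type"))
                if xsd_type:
                    term["@type"] = XSD + xsd_type
                context[field] = term

    walk(mapping["properties"])
    return {"@context": context}


# Made-up mapping fragment for illustration.
sample_mapping = {
    "properties": {
        "safetyreportid": {"type": "string"},
        "patient": {"properties": {"patientweight": {"type": "double"}}},
    }
}
ctx = context_from_mapping(sample_mapping, "drug_event")
print(sorted(ctx["@context"]))
```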

westurner commented 9 years ago

To make these more useful, there could be mappings e.g. to URNs/URIs that would need to be manually added to the JSON-LD @contexts (e.g. http://schema.org/docs/meddocs.html )

See "TODO" here for broader health informatics #LinkedData context: https://westurner.github.io/opengov/us/#health

westurner commented 9 years ago

http://schema.org/docs/meddocs.html

The schema does provide a way to annotate entities with codes that refer to existing controlled medical vocabularies (such as MeSH, SNOMED, ICD, RxNorm, UMLS, etc) when they are available.

bewest commented 8 years ago

To add some more color to this, after reading https://medium.com/@chrishannemann/measure-seventy-five-times-cut-once-further-blood-glucose-meter-testing-9e769a853710, I was inspired to mock up a way for people (citizen scientists) to contribute pair-wise readings from glucometers in order to aid post-market surveillance.

To make this easy, I noticed that openfda published device registrations and listings, and thought this might be a good way to automatically populate a list of meters for users to choose from. However, it's not clear to me how common labeling might be linked directly to attributes of a particular device, or how the device registrations overlap with what people experience in the market.

For example, it appears I can search for, e.g., OTC glucose meters by restricting to regulation_number:862.1345, but that is not quite enough.
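
A minimal sketch of such a query follows. It only builds the request URL and does not call the API. The endpoint path and the `products.openfda.regulation_number` field path are assumptions based on the device registration and listing dataset and should be checked against the API reference; `glucose_meter_query` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed endpoint for the device registration and listing dataset.
BASE = "https://api.fda.gov/device/registrationlisting.json"


def glucose_meter_query(regulation_number="862.1345", limit=5):
    """Build a search URL restricted to one regulation number."""
    params = {
        # Assumed field path; verify against the openFDA field reference.
        "search": f"products.openfda.regulation_number:{regulation_number}",
        "limit": limit,
    }
    return f"{BASE}?{urlencode(params)}"


url = glucose_meter_query()
print(url)
```

A query like this returns every listing under the regulation number, which is why it cannot single out one specific device.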

Here's a quick hello-world demo using FDA's registration database to provide an easy way to select a glucometer that is sold in the US.

Once chosen though, I'm having trouble finding a unique identifier that would identify only the selected device, and as a nice to have, it'd be lovely to find links to additional media/labeling connected with the approval/device.

HTH see what people can start to do with this very very cool API. It's been wonderful to see more and more data added and start to be able to integrate against this.

westurner commented 8 years ago

On Apr 8, 2016 10:25 AM, "Ben West" notifications@github.com wrote:

[...].

Once chosen though, I'm having trouble finding a unique identifier that would identify only the selected device, and as a nice to have, it'd be lovely to find links to additional media/labeling connected with the approval/device.

There may be opportunities for linking with schema.org RDF / JSON-LD:

AFAIU, feedback/suggestions/(PRs) for e.g. schema:MedicalDevice should be added to (or referenced w/ #.492).

Potential data publishers here:

IIUC, your need implies a need for (RDF(a), JSON-LD) linked data mappings between { device codes , urls , [?] }.

westurner commented 8 years ago

And then from the draft schema.org health-lifesci RDFS extension, there are also MedicalTrialDesign subclasses

from https://github.com/schemaorg/schemaorg/issues/492 w/ extra markdown list indentation here :

westurner commented 4 years ago

@beardedfinch Why did you close this issue?

On Friday, October 4, 2019, Jack Finch notifications@github.com wrote:

Closed #5.

violetcrestedwren commented 4 years ago

Hi Wes,

I was going through and removing some old issues that no longer seemed relevant. If this is still relevant let me know and the team will take a look.

westurner commented 4 years ago

@beardedfinch et al.

We can maximize the utility of this and other FDA datasets by using URIs as identifiers and using or creating RDFS vocabularies with URIs for each "column" of each Dataset.

Elasticsearchjsonld is one way to generate a JSONLD @context (akin to RDFS schema) from an existing elasticsearch mapping schema.

When I search for "FDA linked data", I find mentions of universal device identifiers for use with EHRs: https://www.healthdatamanagement.com/news/fda-sees-benefits-of-linking-universal-device-ids-to-ehrs

There's also the FDA DSCSA pharmaceutical blockchain pilot program where e.g. JSONLD (or any other RDF linked data representation) would be very helpful for data integration with industry and international datasets.

Use cases for linked data for integration with this dataset:

You might argue that this issue should be closed because there is currently no FDA effort to publish this or other datasets as linked data. Or, it could be argued that this issue should remain open precisely because there is no other industry effort to enable data integration with linked data.

To whom at FDA should the strong case for linked data presented by e.g. https://5stardata.info and https://lod-cloud.net/ be directed?