Medical Software Industry and IPFS

lanzafame commented 6 years ago

Status: DRAFT

So for context, I just left a job working for an Australian practice management software vendor, and during my time there I was involved in a technical working group, which exposed me to some of the industry's technical and political issues. What follows is a rough draft of the issues and of potential solutions, which are still being added to.

The medical software industry has many issues. Below are a few that I see as critical to the industry moving forward:

Technical Issues

secure, reliable storage of patient data
consistent implementations of actually interoperable protocols
historically, very closed (read: proprietary) industry with rampant vendor lock-in
secure, reliable exchange of information (read: messaging)
historically slow to adopt new technology, usually (though not always) because of high development costs
secure identity infrastructure that is open usually needs to be provided for free by an 'unbiased', authoritative organization, i.e. government
protocols and standards are extremely difficult, firstly, to implement and, secondly, to upgrade, and they MUST be backward compatible
large amounts of unversioned, near-duplicate patient data

Ethical Issues

a patient's data is sharded across the multiple practices and hospitals that they visit in their lifetime, with 'no real' avenue[^1] for aggregating and distributing it as a consistent volume. A patient, with everything that being human entails (biased listening, poor recall, etc.), is required to be THE source of truth, even though they are rarely provided with any actual data, only a spoken diagnosis. This sharding leads to several issues, e.g. doctor shopping (the bad kind, see wiki) is easier because patients get to choose exactly what they share (read: hide information), and doctors are able to make changing doctors difficult by not releasing a patient's charts.

Political Issues

There is a minefield of political issues in healthcare, and it is easy to step on people's toes. Just something to be aware of. YMMV.
Government buy-in can be critical or required for progress to be made.

Privacy/Security Issues

A technical solution may need to contend with standards like HIPAA
Several countries require that the personal information of their citizens be stored on servers located in that country, especially in the case of health-related information.

SOLUTIONS

Multiformats

The medical software industry suffers from an abundance of protocols and standards, both open and proprietary, at different levels of the software stack. Backward compatibility is a significant priority in the industry, and as a result adoption of new or updated protocols/standards is historically slow: without easy, cost-effective upgrade paths, most businesses don't see the benefit.

This is where I see multiformats, with its self-describing nature, being of great benefit. Here is a list of reasons/benefits:

increased interoperability as exact versions/formats can be determined at runtime
- this results in less coordination between organizations beforehand, which is a big win in the health tech space 👍 👍
easier to define programmatic upgrade and downgrade paths
the format of stored binary data can always be determined (e.g. when data has been retrieved from a REST API and stored, the version of the API, which will usually indicate the format of the payload, isn't usually found in the payload itself)
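To make the self-describing idea concrete, here is a minimal sketch of one multiformat, multihash, which prefixes a digest with the code of the hash function and the digest length. The codes 0x12 (sha2-256) and 0x13 (sha2-512) come from the multicodec table; the function names are illustrative only, and the sketch assumes single-byte varints, which holds for these two codes and lengths.

```python
import hashlib
from typing import Tuple

# Multihash function codes from the multicodec table, mapped to hashlib names.
HASH_NAMES = {0x12: "sha256", 0x13: "sha512"}


def encode_multihash(code: int, payload: bytes) -> bytes:
    """Prefix a digest with <hash code><digest length> so it describes itself."""
    digest = hashlib.new(HASH_NAMES[code], payload).digest()
    # Both values are below 128, so each fits in a single varint byte.
    return bytes([code, len(digest)]) + digest


def decode_multihash(blob: bytes) -> Tuple[str, bytes]:
    """Read the prefix back to learn which hash function produced the digest."""
    if len(blob) < 2:
        raise ValueError("too short to be a multihash")
    code, length = blob[0], blob[1]
    digest = blob[2:2 + length]
    if code not in HASH_NAMES or len(digest) != length:
        raise ValueError("unknown or truncated multihash")
    return HASH_NAMES[code], digest


def verify(blob: bytes, payload: bytes) -> bool:
    """Check a payload against a stored multihash, whatever hash it used."""
    name, digest = decode_multihash(blob)
    return hashlib.new(name, payload).digest() == digest
```

A system that stored sha2-256 values years ago and now writes sha2-512 can verify both with the same `verify` call, because the stored bytes say which function to use. No out-of-band version negotiation is needed.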

Data Ownership + ACLs

The 'cloud' age, according to many in the health tech space, is meant to give patients greater access to their records and therefore access to better healthcare (practitioners will have better data) or cheaper healthcare (patients will be able to shop around for cheaper healthcare providers without the difficulty of dragging around the printed-out medical records from their previous doctor).

I believe this is false. All the 'cloud' age is doing for patients is moving their data from the server in the back room of their doctor's practice into a database owned and operated by some SaaS provider that may or may not go bust (leaving the patient without the records that were entrusted to the provider). At a Hacking Health event last year, two-thirds of pitches put forward the concept of a cloud-based patient data consolidation app, which was, put simply, a nice UI over what would need to be a constantly expanding integration engine.

An alternative solution is to use IPFS as a data layer that is shared by all healthcare-related applications. A patient would have a globally unique ID that links to their medical records (CRDT??). The technical difficulties involved in this approach relate to how a patient would grant and revoke access to specific 'sub-trees' of their records (e.g. your chiropractor doesn't need access to your blood results, just your spinal x-ray). Which identities to recognize (individuals, organizations, organizations representing individuals, services provided by organizations, individuals providing services, etc.) and how they are able to interact is a complicated aspect of this approach (though not unique to it).

Identity Scenarios

Contracted Service Provider

                                          ┌────────────────────────────────┐ 
                                          │ Doctor2 uses a Cloud Practice  │ 
                                          │Management Software (PMS), which│ 
                                          │means that Doctor2 has entrusted│ 
                                          │  them to receive messages on   │ 
                                          │their behalf from other doctors │ 
                                          │      and pathology labs.       │ 
                                          └────────────────────────────────┘ 

     ┌────────────┐                      ┌─────────────┐       ┌────────────┐
     │            │      send msg        │             │       │            │
     │  Doctor1   │─────────────────────▶│  Cloud PMS  │──────▶│  Doctor2   │
     │            │     to Doctor2       │             │       │            │
     └────────────┘                      └─────────────┘       └────────────┘

┌───────────────────────────────────────┐                                       
│How does Doctor1 establish that Doctor2│                                       
│has entrusted Cloud PMS to receive     │                                       
│messages on their behalf?              │                                       
│                                       │                                       
│Some constraints on how:               │                                       
│                                       │                                       
│- Doctor2 CANNOT gift their secret key │                                       
│to Cloud PMS                           │                                       
│                                       │                                       
│- Doctor2 SHOULD be able to leave Cloud│                                       
│PMS and Cloud PMS SHOULD NOT be able to│                                       
│handle messages of Doctor2             │                                       
└───────────────────────────────────────┘

Side note: the Cloud PMS, if it is to stay in business for long, will have many customers. In networking terms, it represents a router in the messaging system between doctors, but only for inbound messages.

Patient gives access to a subtree of their data //TODO

//TODO: aspects yet to cover: ipfs-cluster + ~some form of geographically scoped tagging or the use of country-specific ipfs networks

[^1]: a patient is allowed to request a doctor to export their records but the doctor has a 'right' (in Australia) to charge the patient a 'processing' fee.

jiangbubai commented 6 years ago

Sounds great! I am looking forward to your update.

pgte commented 6 years ago

@lanzafame thank you for this write up, it's really enlightening!

Identity

How does Doctor1 establish that Doctor2 has entrusted Cloud PMS to receive messages on their behalf?

One solution for this is to use a session key-pair. The Cloud PMS acts like a user agent for Doctor 2, so Doctor 2 can create a session key-pair and send it to the Cloud PMS. This session key works like a token and can be used to authenticate Doctor 2 to Doctor 1 when the latter is talking to the Cloud PMS. This key should be revocable.

This mechanism is being developed here in this Identity Management RFC: https://github.com/ipfs-shipyard/peer-star/pull/15

Access Control

A patient would have a globally unique ID that linked to their medical records (CRDT??). The technical difficulties involved in this approach relate to how a patient would grant and revoke access to specific 'sub-trees' of their records (i.e. your chiropractor doesn't need access to your blood results, just your spinal x-ray).

Here, an ACL over a CRDT would be a solution. This would prevent writes that the ACL does not authorize from being accepted.

Preventing reads is more complex:

Preventing reads of the data in the past is hard because, even if the permission is revoked, the user has already received that data. What the software can do is enforce the deletion of the data once the access is revoked (to be compliant).

Preventing reading the data in the future can be done with key rotation (creating a new key and only delivering it to the right people).
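A minimal sketch of that key-rotation idea, tracking only who holds which key. The class and method names are hypothetical, and no real encryption is performed; the assumption is that each record would be encrypted under the key of the epoch in which it was written.

```python
import secrets
from typing import Dict, List, Set


class RotatingRecordKeys:
    """Tracks which reader holds which epoch key (bookkeeping only)."""

    def __init__(self, readers: Set[str]) -> None:
        self.readers = set(readers)
        self.epoch_keys: List[bytes] = []
        self.held: Dict[str, Set[int]] = {}
        self.rotate()

    def rotate(self) -> int:
        """Create a new epoch key and deliver it only to current readers."""
        self.epoch_keys.append(secrets.token_bytes(32))
        epoch = len(self.epoch_keys) - 1
        for reader in self.readers:
            self.held.setdefault(reader, set()).add(epoch)
        return epoch

    def revoke(self, reader: str) -> int:
        """Drop a reader, then rotate so future data uses a key they lack."""
        self.readers.discard(reader)
        return self.rotate()

    def can_read(self, reader: str, epoch: int) -> bool:
        """True if the reader was ever handed the key for this epoch."""
        return epoch in self.held.get(reader, set())
```

After `revoke("chiro")`, `can_read("chiro", 1)` is false but `can_read("chiro", 0)` stays true, which mirrors the point about past reads: a key that has already been delivered cannot be taken back.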

Any other options you see?

/cc @satazor @joaosantos15

lanzafame commented 6 years ago

One solution for this is to use a session key-pair.

@pgte haha yes this would work perfectly, reading this, I realised I was still holding on to several 'bureaucratic' restrictions that would have prevented this solution. 🤦‍♂️

Preventing reads of the data in the past is hard

Ideally, this would be possible, but in comparison to what is currently standard practice across the industry, I believe that most patients would be ecstatic just to know who is accessing their data and when, and to be able to deny them access if they choose to.

This may be naive, but is key rotation required for future data? If the permissions are tied to a specific subtree, any update, e.g. adding a new pathology result, would result in a new subtree hash, so the party with authorization to read v1 of the subtree won't have access to v2 of the subtree until the patient grants them access. Also, I feel there is a place here for IPLD, in particular a FHIR version of IPLD, to give semantic meaning to what is being permissioned, e.g. give Doctor1 access to the blood tests that were requested by Doctor1 but not those requested by Doctor5. But I think that is probably complicating the simpler issue of granting and revoking access at this point in time.
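The subtree-hash argument can be illustrated with a toy content-addressed tree. This is a sketch only: it uses SHA-256 over JSON rather than real IPLD encoding or CIDs, and the record layout and field names are made up for the example.

```python
import hashlib
import json


def node_hash(node) -> str:
    """Hash a leaf by its value, and a dict by the hashes of its children."""
    if isinstance(node, dict):
        children = {name: node_hash(child) for name, child in node.items()}
        encoded = b"tree:" + json.dumps(children, sort_keys=True).encode()
    else:
        encoded = b"leaf:" + json.dumps(node).encode()
    return hashlib.sha256(encoded).hexdigest()


record = {
    "pathology": {"2018-01-blood": "normal"},
    "radiology": {"2018-02-spine-xray": "mild scoliosis"},
}

pathology_v1 = node_hash(record["pathology"])
radiology_v1 = node_hash(record["radiology"])

# Adding a new pathology result changes only the pathology subtree (and root).
record["pathology"]["2018-06-blood"] = "low iron"
pathology_v2 = node_hash(record["pathology"])
radiology_v2 = node_hash(record["radiology"])
```

Here `pathology_v2` differs from `pathology_v1` while the radiology hash is unchanged, so a grant that names `pathology_v1` does not automatically cover the new result. Note that this only helps with access control if the hash is not, on its own, enough to fetch and read the data.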

Thanks for the feedback 👍 I will keep plugging away at more use cases.

lanzafame commented 6 years ago

@pgte @satazor @joaosantos15 I highly recommend you all join the FHIR Zulip chat and have a read of the conversation here regarding the use of DIDs and other forms of identities, where the medical industry is at politically, and why it isn't moving forward. I realise this is way outside the scope of your WG, but I think it provides a lot of food for thought. For context, Grahame Grieve is the original author of the FHIR spec, and though he may seem like he is on the nay side of the discussion, he is just being pragmatic about what is required for these forms of technology to become reality. Doug Bulleit is behind a project called FHIRBlocks, something that I envision this project (medical data on ipfs/ipld) outcompeting (personally, I don't think a blockchain by itself suits the medical industry, as it has too much data). Either way, it is a very enlightening discussion around identities.

pgte commented 6 years ago

@lanzafame Regarding the identity discussion, I believe that some solutions for self-sovereign identity meet the privacy requirements by keeping data with the user, off the blockchain. The problem I see is that they require user consent, which may not always be possible in all contexts (user may be unconscious, uncooperative, etc.). This is a tough problem, no doubt, that needs more thought.

Regarding the "read data" permission system:

If I'm seeing this correctly, the only way for data to be inaccessible to an actor that has no permission is to encrypt it with a key that this actor doesn't know. So, for past data, it's impossible to be assured that the actor won't be able to access data that it has already seen.

As for future data, yes, we could partition the data in a way that all the new data entering a patient record requires a new key.

Self-sovereign identity

The other option (that goes along the lines of self-sovereign identity) is for the patient to hold all the data. Then, when a practitioner requires a copy, it is encrypted with an exchange-specific session key and sent from the user to the practitioner. So a new copy of the data could be sent. Also, that data could be time-stamped, annotated and signed by the patient before being sent. This way, when a doctor presents the data to a third party, that third party can check whether the doctor has permission from the patient to relay that data. If not, they should discard that data so as not to be liable for infringing the law. This could easily be enforced automatically by the software.

If so, a permissioned patient data block would contain:

subject:
- the patient DID
- the claim itself (patient data)
- the author (practitioner) DID
- the author (practitioner) signature
permission:
- target DID (the target practitioner / identity)
- permission verbs given by user (read-only, transmit to others, etc.)
- time of creation
- expiration date
- parent permission (for delegation purposes)
the patient signature of all the above

and it would be encrypted with the target DID public key.
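As a rough sketch, the block outlined above could be modelled like this. All class and field names are hypothetical, not taken from any SSID or FHIR standard, and signing and encryption are left out: the code only shows which bytes the patient signature would cover.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Subject:
    patient_did: str
    claim: str                      # the patient data itself
    author_did: str                 # practitioner who wrote the claim
    author_signature: str


@dataclass(frozen=True)
class Permission:
    target_did: str                 # practitioner / identity receiving access
    verbs: List[str]                # e.g. ["read"] or ["read", "transmit"]
    created_at: str                 # ISO 8601 timestamp
    expires_at: str
    parent: Optional[str] = None    # parent permission, for delegation


@dataclass(frozen=True)
class PermissionedBlock:
    subject: Subject
    permission: Permission
    patient_signature: str = ""     # placeholder; no real signing here

    def signing_payload(self) -> bytes:
        """Canonical bytes of subject + permission, i.e. 'all the above'."""
        body = {
            "subject": asdict(self.subject),
            "permission": asdict(self.permission),
        }
        return json.dumps(body, sort_keys=True, separators=(",", ":")).encode()
```

Serializing with sorted keys and fixed separators gives the same bytes for the same content, so a verifier can recompute the payload before checking the patient signature. The signature field itself is deliberately excluded from the payload.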

@lanzafame I'm not sure if there is already a standard in SSID or FHIR for this... @joaosantos15 ^^do you think this makes sense, am I missing something?

Data creation

The data creation (by a practitioner) would be simpler in this self-sovereign framework:

The practitioner would create the claim, add authorship info, sign it, encrypt it with the patient public key and send it to the patient. The patient would then store it securely for later use.

Discussion:

The alternative to using a self-sovereign solution is for the data to exist publicly (somehow), but still encrypted for privacy reasons. The encryption of the patient data has to be done by the patient with a key that is then transmitted to an actor that the patient wants to give read permissions to. The problem is that, to revoke read permission to this, you will need to rotate the key and encrypt that data with a new key and transmit it again to the actors.

As data changes, it could be encrypted with a key that is being renewed (a key would only work for a specific amount of time, expiring later).

I think the self-sovereign solution is more robust than this last one.

@lanzafame thoughts?

joaosantos15 commented 6 years ago

Giving cryptographic consent when the patient is unconscious

@pgte

The problem I see is that they require user consent, which may not always be possible in all contexts (user may be unconscious, uncooperative, etc.). This is a tough problem, no doubt, that needs more thought.

We could shard a secret among n participants. With the consent of m participants (m<n), the secret could be recovered. The n participants would be trustees of the patient (spouse, family, doctor, friends).
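One well-known way to do this is Shamir's Secret Sharing. The following is a minimal sketch over a prime field, assuming the secret is an integer smaller than the chosen prime; the function names are illustrative, and a production system would use a vetted library rather than this code.

```python
import secrets
from typing import List, Tuple

PRIME = 2**127 - 1  # a Mersenne prime; the secret must be smaller than this


def split_secret(secret: int, n: int, m: int) -> List[Tuple[int, int]]:
    """Split `secret` into n shares so that any m of them recover it."""
    if not (1 <= m <= n) or not (0 <= secret < PRIME):
        raise ValueError("need 1 <= m <= n and 0 <= secret < PRIME")
    # Random polynomial of degree m-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(m - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares: List[Tuple[int, int]]) -> int:
    """Lagrange-interpolate the polynomial at x = 0 to get the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # Modular inverse via Fermat's little theorem (PRIME is prime).
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret
```

With `split_secret(key, n=5, m=3)`, each of five trustees holds one share and any three of them can rebuild the key, while fewer than three shares reveal nothing about it.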

Self-Sovereign Identity

@pgte That sounds great. I'd only add that the step

and it would be encrypted with the target DID public key.

could be replaced by encrypting it with a specific key for that purpose, not necessarily the recipient's DID key. There are two reasons for this: First, the recipient is already made explicit in the permission.targetDID field. Second, there could be a scenario where you want to send the data to an endpoint which may not be directly controlled by the recipient (e.g. you send the data to the hospital where the recipient, a doctor, works. The hospital's software decrypts it, but only the doctor is allowed to use the data, or re-share it).

About the “read data” permission system

I agree with your comment,

The problem is that, to revoke read permission to this, you will need to rotate the key and encrypt that data with a new key and transmit it again to the actors. ... As data changes, it could be encrypted with a key that is being renewed (a key would only work for a specific amount of time, expiring later).

Adding to this, you would have no way of controlling who was accessing the data unless you rotated the keys every time you shared records with a new entity.

The problem of who owns the records

Here, we are assuming that patients own their own medical records. I believe that in Portugal, for instance, medical records are the doctor's property. So there is a question of whether the patient is even authorized to perform access control over them.

@lanzafame thank you for the link to the chat about FHIR, I'm reading it now and it's really interesting. Finally, a place where blockchain is not seen as the only, or even optimal, solution for healthcare 🙏.

pgte commented 6 years ago

@joaosantos15

We could shard a secret by n participants. By having the consent of m participants (m<n) the secret could be recovered. The n participants would be trustees of the patient (spouse, family, doctor, friends).

Good idea, that would work for many more cases. One problem I see, though: it does not work for "John Doe" patients, i.e. patients who arrive uncooperative and about whom you have no information. I think this could only be solved by using biometrics, chip implants, a paper key, etc.

and it would be encrypted with the target DID public key.

could be replaced by encrypting it with a specific key for that purpose, not necessarily the recipient's DID key. There are two reasons for this: First, the recipient is already made explicit in the permission.targetDID field. Second, there could be a scenario where you want to send the data to an endpoint which may not be directly controlled by the recipient (e.g. you send the data to the hospital where the recipient, a doctor, works. The hospital's software decrypts it, but only the doctor is allowed to use the data, or re-share it).

Hmm, I don't see a need for this, correct me if I'm wrong: The doctor decrypts the data coming from the patient with their own private key. If they need to send it to another party, they can:

add another permission, with the parent permission pointing to the original patient permission block.
encrypt and sign the entire permissioned patient data block with the recipient key.

This way, this final recipient would be able to:

authenticate the sender (the doctor, in this case)
verify that the patient allowed this data to be propagated. If not, discard it (to conform to legal requirements).
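A sketch of the check the final recipient would run on such a chain of permissions. The `Grant` structure and the `"transmit"` verb are hypothetical names chosen for this example, signature verification is omitted, and the rule that a delegated grant cannot carry more verbs than its parent is an added assumption.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional


@dataclass(frozen=True)
class Grant:
    grant_id: str
    issuer: str                     # who created this permission
    target: str                     # who receives access
    verbs: FrozenSet[str]           # e.g. {"read"} or {"read", "transmit"}
    parent: Optional[str] = None    # id of the permission it derives from


def chain_is_valid(grant_id: str, grants: Dict[str, Grant], patient: str) -> bool:
    """Walk parent links back to the patient, checking every hop was allowed."""
    seen = set()
    grant = grants.get(grant_id)
    while grant is not None and grant.grant_id not in seen:
        seen.add(grant.grant_id)
        if grant.parent is None:
            # Root of the chain: it must have been issued by the patient.
            return grant.issuer == patient
        parent = grants.get(grant.parent)
        if parent is None:
            return False
        # The re-sharer must be the parent's target and be allowed to transmit.
        if parent.target != grant.issuer or "transmit" not in parent.verbs:
            return False
        # Assumption: delegation can narrow permissions but never widen them.
        if not grant.verbs <= parent.verbs:
            return False
        grant = parent
    return False  # unknown grant id or a cycle in the parent links
```

If the patient gave the doctor only `{"read"}`, a grant the doctor issues to a specialist fails this check, and the specialist's software would discard the data.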

@joaosantos15 makes sense?

pgte commented 6 years ago

Here, we are assuming that patients own their own medical records. I believe that in Portugal, for instance, medical records are the doctor's property. So there is a question of whether the patient is even authorized to perform access control over them.

Good point. @lanzafame any insight on this?

satazor commented 6 years ago

Hello!

Have you guys looked into https://github.com/decentralized-identity/hubs/blob/master/explainer.md?

This looks very promising and it would probably solve the storage and permissions part. There's actually an example of a medical record being stored by a doctor into the hub of a patient.

Note that this is a specification and may be implemented with any technology. We could definitely build an identity hub on top of ipfs.

joaosantos15 commented 6 years ago

@pgte

Hmm, I don't see a need for this, correct me if I'm wrong: The doctor decrypts the data coming from the patient with its own private key. If they require to send it to another party, they can:

add another permission, with the parent permission pointing to the original patient permission block.

encrypt and sign the entire permissioned patient data block with the recipient key.

This way, this final recipient would be able to:

authenticate the sender (the doctor, in this case)

verify that the patient allowed this data to be propagated. If not, discard it (to conform to legal requirements).

But would that work in practice? If a lot of people need access to the data, wouldn't that require too many individual actions from the doctor?

For instance, if you're admitted to the ER with what appears to be an abdominal trauma, the doctor will probably request a CT scan. The CT scan technician will also need access to your medical history to see if you're not allergic to CT scan contrast. The original doctor might order several more exams, and each exam's technician will need access to your medical history.

So, in this case, the doctor would be required to sign and encrypt a large number of separate authorizations. It would probably be simpler to have an authorization per hospital, rather than per doctor.

lanzafame commented 6 years ago

@pgte @joaosantos15

Second, there could be a scenario where you want to send the data to an endpoint which may not be directly controlled by the recipient (e.g. you send the data to the hospital where the recipient, a doctor, works. The hospital's software decrypts it, but only the doctor is allowed to use the data, or re-share it).

This is actually a really common practice, especially in the case of communication between doctors and other healthcare service providers, i.e. hospitals, pathology clinics, radiology clinics, and referrals of patients by doctors to specialists. The main reason is that doctors leave hospitals and practices, and it isn't expected that patients follow a doctor around. Hence, when a doctor sells a practice, the database of whatever medical software they used, which includes all the patients' information, is stated as part of the sales agreement.

lanzafame commented 6 years ago

I should mention that the whole ER/John Doe scenario is a straw man of an edge case. If someone comes into the ER requiring immediate attention, no doctor is looking up and reading the patient's records; they are treating the patient according to the symptoms that are presenting themselves. As it stands, the current systems don't have an answer for this, and it isn't the problem that we want to solve.

Here, we are assuming that patients own their own medical records. I believe that in Portugal, for instance, medical records are the doctor's property. So there is a question of whether the patient is even authorized to perform access control over them.

Good point. @lanzafame any insight on this?

Doctors would no longer own the records. This is very much a patient-centric system that would put the patient in the centre, relegate doctors to 'service providers', and give patients the ability to shop around (not in the doctor-shopping sense, but in the no-vendor-lock-in sense).

pgte commented 6 years ago

@lanzafame

This is actually a really common practice, especially in the case of communication between doctors and other healthcare service providers, i.e. hospitals, pathology clinics, radiology clinics, and referrals of patients by doctors to specialists. The main reason is that doctors leave hospitals and practices, and it isn't expected that patients follow a doctor around, hence when a doctor sells a practice, the database of whatever medical software they used, is stated as part of the sales agreement, which includes all the patients' information.

In this case I would see that the patient sends that data to the hospital, who then sends it to the doctor upon request (can all be automated). The thing here is that the patient maintains ownership of the data, and the hospital is leasing the ownership to the doctor for a brief period of time.

lanzafame commented 6 years ago

just making note of this: https://www.healthdatamanagement.com/news/vanderbilt-leverages-blockchain-fhir-for-secure-sharing-of-medical-records https://www.dre.vanderbilt.edu/~schmidt/PDF/FHIRChain-jnca.pdf

lanzafame commented 6 years ago

@jonnycrunch

jonnycrunch commented 6 years ago

@lanzafame cool, thanks!

jonnycrunch commented 6 years ago

Sorry for the delay... just getting caught up on my backlog of conversations. Yes, super cool! I agree, huge potential. However, I have come to realize that healthcare is slow to adopt new technologies. I see the barrier ultimately being crypto key management. It also boils down to ownership of the data. Even if the doctor wrote the note, after she/he leaves the practice, does the doctor even maintain access to the data, not to mention the patient? Yet I am obviously bullish on the applications of the DID spec.

lanzafame commented 5 years ago

Some other work in this space: https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8717579

shumy-tools commented 5 years ago

As a person researching healthcare solutions (mainly PACS/DICOM), I believe that access control and tracking are a must in such a system. However, an ACL by itself cannot provide the functionality for all the use cases. I would support a Service Provider Interface (SPI), similar to the Tendermint ABCI, that abstracts the access control mechanism as much as possible.

Also, one can encrypt without requiring a key (SSS, i.e. Shamir's Secret Sharing), supporting any break-the-glass requirements. However, this would be useless without proper access control.

Moreover, I don't know if this is already possible, but support for file slicing or some form of file aggregation, in order to support future implementations of DICOM Whole Slide Imaging (WSI), would be nice too.

lanzafame commented 5 years ago

Moreover, I don't know if this is already possible but, support for file slicing or some form of file aggregation in order to support future implementations of DICOM Whole Slide Imaging (WSI) would be nice too.

This is actually fairly straightforward and technically already occurs in IPFS with how it chunks files, but currently it would most likely be done in a non-optimal way for DICOM file types.
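To illustrate why plain fixed-size chunking can be non-optimal for formats like DICOM, here is a simplified sketch. It hashes 256 KiB chunks (the long-standing default of the fixed-size chunker in go-ipfs) with SHA-256 and does not build the actual IPFS DAG; `chunk_hashes` is a made-up helper name, and the byte strings stand in for image data.

```python
import hashlib
from typing import List

CHUNK_SIZE = 256 * 1024  # 262144 bytes


def chunk_hashes(data: bytes, size: int = CHUNK_SIZE) -> List[str]:
    """Split data into fixed-size chunks and return each chunk's SHA-256."""
    return [
        hashlib.sha256(data[i:i + size]).hexdigest()
        for i in range(0, len(data), size)
    ]


# Three distinct, chunk-aligned "tiles" standing in for slide image data.
slide = b"".join(bytes([i]) * CHUNK_SIZE for i in range(3))

original = chunk_hashes(slide)
appended = chunk_hashes(slide + b"extra frame")   # data added at the end
shifted = chunk_hashes(b"HDR" + slide)            # 3 bytes added at the front
```

Appending data leaves the first three chunk hashes untouched, so those chunks deduplicate. Prepending even a three-byte header shifts every chunk boundary, so `shifted` shares no hashes with `original`. A format-aware chunker that splits on DICOM frame or tile boundaries would avoid that.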

Re: ACL, I agree with what you are saying. I am not sure whether SSS, in the case where the patient(user) has full control of their data, is viable but I haven't really thought about it.
