MIT-LCP / mimic-code

MIMIC Code Repository: Code shared by the research community for the MIMIC family of databases
https://mimic.mit.edu
MIT License

Can I ask for a clarification in relation to the physionet t&c? ; they state #1478

Closed (bramiozo closed 1 year ago)

bramiozo commented 1 year ago

Can I ask for a clarification regarding the PhysioNet T&C? They state:


I will not attempt to identify any individual or institution referenced in PhysioNet restricted data.
I will exercise all reasonable and prudent care to avoid disclosure of the identity of any individual or institution referenced in PhysioNet restricted data in any publication or other communication.
I will not share access to PhysioNet restricted data with anyone else.
I will exercise all reasonable and prudent care to maintain the physical and electronic security of PhysioNet restricted data.
If I find information within PhysioNet restricted data that I believe might permit identification of any individual or institution, I will report the location of this information promptly by email to PHI-report@physionet.org, citing the location of the specific information in question.
I have requested access to PhysioNet restricted data **for the sole purpose of lawful use in scientific research, and I will use my privilege of access, if it is granted, for this purpose and no other.**
I have completed a training program in human research subject protections and HIPAA regulations, and I am submitting proof of having done so.
I will indicate the general purpose for which I intend to use the database in my application.
If I openly disseminate my results, I will also contribute the code used to produce those results to a repository that is open to the research community.
This agreement may be terminated by either party at any time, but my obligations with respect to PhysioNet data shall continue after termination.  

How should I read this? It sounds extremely limiting. We cannot share the data, and we cannot use it for anything other than scientific work? How does this apply to models?

Originally posted by @bramiozo in https://github.com/MIT-LCP/mimic-code/issues/850#issuecomment-1416776418

alistairewj commented 1 year ago

It's not that limiting. Don't share the data with others. Obviously, others can access the data through the same mechanisms you do, and you can discuss your work with them. The principle is that you should not distribute the data to others.

It is fine to publish very high-level aggregations of the data, as this is essentially what a publication is. In the past, models fell into this category (think of a 10-coefficient logistic regression or a relatively simple ML model). However, the line is becoming blurrier with very high-capacity models. Those are best shared through the credentialed access mechanism, e.g. our work with a clinical large language model: https://physionet.org/content/clinical-t5/

P.S. Scientific research is broad. Plenty of industrial scientists do research. Developing models is a form of research.

bramiozo commented 1 year ago

Thanks for the clarification, Alistair. Through my academic glasses I projected non-existent hurdles :).

alistairewj commented 1 year ago

Ah, the DUA is just a bit unclear in places. Happy to field the inquiry!