Closed: bramiozo closed this 1 year ago
It's not that limiting. Don't share the data with others. Obviously, others can access the data through the same mechanisms you do, and you can discuss your work with them. The principle is that you should not distribute the data to others.
It is fine to publish very high-level aggregations of the data, as this is essentially what a publication is. In the past, models were a part of this category (think a 10-coefficient logistic regression or a relatively simple ML model). However, the line is becoming blurrier with very high-capacity models. Sharing of those is best done through the credentialed access mechanism, e.g. our work with a clinical large language model: https://physionet.org/content/clinical-t5/
P.S. Scientific research is broad. Plenty of industrial scientists do research. Developing models is a form of research.
Thanks for the clarification, Alistair. Through my academic glasses I projected non-existent hurdles :).
Ah, the DUA is just a bit unclear in places. Happy to field the inquiry!
Can I ask for a clarification in relation to the PhysioNet T&C? They state
How should I read this because it sounds extremely limiting. We cannot share and we cannot use it for anything other than scientific work? How does this apply to models?
Originally posted by @bramiozo in https://github.com/MIT-LCP/mimic-code/issues/850#issuecomment-1416776418