Standardize NCA and Modeling Result Comparisons

billdenney commented 5 years ago

As discussed in nlmixrdevelopment/nlmixr#119, there is interest by @vjd, @kestrel99, and @mattfidler to standardize cross-tool comparisons for both pharmacometric model fitting and NCA.

The PKdata repository was initially conceived to collect and assist with cross-tool NCA comparisons, so I thought an issue here was a good place to discuss the needs, benefits, organization, etc.

Notably, prior and current efforts to standardize NCA have also included the NCA consortium (https://github.com/NCAConsortium), POSSC (https://www.possc.org/), PhUSE (https://www.phusewiki.org/wiki/images/e/ed/PhUSE_CSS_WhitePaper_PK_final_25March2014.pdf), and others.

vjd commented 5 years ago

Thanks, @billdenney. To begin with, I suggest renaming the repo to something that captures more than just its contents (looking for suggestions).

The intent of the repo should technically be to test and benchmark open source development efforts against results from commercial software, and to maintain this as a central hub for any such efforts.

We can start simple by just including PK/PD data, but this can extend to other things related to pharmaceutical modeling and simulation (maybe with a focus on clin pharm).

Everyone seems to use the PK and PD test datasets generated by the Monolix team. These datasets (especially the PK) are suitable for NCA analysis too.
We have permission from Marc Lavielle to host the datasets on GitHub.
If not already completed, we can load the NONMEM/Monolix results from analyzing these datasets with various ML estimation methods and generate a specific output format that allows testing.
We can perhaps draft a complete testing plan that covers a wide variety of requirements that different stakeholders may be interested in.

vjd commented 5 years ago

I also suggest that the organization of the repository be thought out carefully to allow multiple languages, hopefully with CI testing as a feature down the line.

billdenney commented 5 years ago

@vjd, your suggestions look great!

I'm not tied to the repository name, and naming isn't my strongest suit. (PMxData? PMxCompare?)
For the intent statement, I'd generalize that it's not to compare open source to commercial software but to compare the results of various software to each other for cross-validation, research, and discussion of benefits and limitations of the methods.
I'm not familiar with the Monolix datasets, but they sound like a good starting point.
For the output format that allows testing, DDMoRe has at least started that effort: http://www.ddmore.eu/projects/so-standard-output
For repository organization, I agree that it shouldn't be language-centric; it should be data-centric (and then allow export to common languages as sub-trees).
1. I'd think that there are classes of data types (e.g. dosing-concentration-time (PK), dosing-concentration-time-effect (PK/PD), and others). For the data format and organization, I think that SDTM, something SDTM-like, or ADaM may be best so that it's something many people are familiar with and so that we don't spend a lot of time reinventing a data format.
2. Then once the data structures are set up, something like a bindings directory could assist with building language-specific interfaces. (For example, a Makefile could be created that builds an R package, nlmixr models, NONMEM-ready data, Monolix-input data, etc.)
We also need to think about the license to use. This may be tricky if we use Monolix data or accept data from others (which I think we should do). My preference here would be that we use something mostly open, perhaps with a requirement that anyone who uses the data for a tool validation or comparison, and publishes it or includes it in any materials outside of their organization, must submit the validation results back to the repository in the standard format.
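To make the bindings idea in the repository-organization point concrete, here is a minimal Python sketch of one such converter: it turns tool-neutral records into NONMEM-style rows. The neutral record layout (`subject`, `time`, `dose`, `conc`) and the function names are hypothetical and are not SDTM, ADaM, or any agreed standard; only the output column names (ID, TIME, AMT, DV, EVID, MDV) follow common NONMEM conventions.

```python
import csv
import io

# Hypothetical tool-neutral records (made-up values, not from any real dataset).
records = [
    {"subject": 1, "time": 0.0, "dose": 100.0, "conc": None},
    {"subject": 1, "time": 1.0, "dose": None, "conc": 8.2},
    {"subject": 1, "time": 4.0, "dose": None, "conc": 3.9},
]

NONMEM_COLUMNS = ["ID", "TIME", "AMT", "DV", "EVID", "MDV"]


def to_nonmem_rows(records):
    """Map neutral records to NONMEM-style rows (assumed layout, see lead-in)."""
    rows = []
    for rec in records:
        is_dose = rec["dose"] is not None
        dv_missing = rec["conc"] is None
        rows.append({
            "ID": rec["subject"],
            "TIME": rec["time"],
            "AMT": rec["dose"] if is_dose else ".",   # "." marks no dose
            "DV": "." if dv_missing else rec["conc"],  # "." marks no observation
            "EVID": 1 if is_dose else 0,               # 1 = dose, 0 = observation
            "MDV": 1 if dv_missing else 0,             # 1 = DV is missing
        })
    return rows


def to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=NONMEM_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


nonmem_rows = to_nonmem_rows(records)
nonmem_csv = to_csv(nonmem_rows)
print(nonmem_csv)
```

A converter like this would sit in a per-tool subdirectory, so the neutral data stays the single source and each tool-specific file is generated from it rather than edited by hand.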

kestrel99 commented 5 years ago

I agree with all the above, and regarding a standard dataset format, I'll point you at the ISoP Data Standards initiative led by Andrijana Radivojevic: http://www.go-isop.org/data-standards-working-group. Its proposal for PK data standards in pharmacometrics is almost ready for release and represents 3+ years of work.

mattfidler commented 5 years ago

I agree. As for the licence, an MIT or BSD licence would probably be OK, as long as Monolix is OK with the datasets being released.

vjd commented 5 years ago

I am not sure that a standard dataset format is feasible; that should be a long-term goal. What I was suggesting is a standard output format for the results obtained from the estimation routines or NCA analysis. That way, individuals can write their tests accordingly over time, as long as the output standard is maintained. @MarcLavielle, are you OK with releasing the Monolix datasets on this open repository for comparisons against other software, preferably under the MIT license?

billdenney commented 5 years ago

@vjd, I'm not sure that I understand your concern about standardizing the datasets.

I think that we will need standardized input formats so that multiple tools can automatically convert the datasets we provide to the format needed by the tool (otherwise, it will be onerous to test what we create). I think that we will need a standardized output format so that comparisons of results can be automated.

I don't think that we will need to invent anything new; we will just need to choose our standard(s) and ensure that we stick to them. (The ISoP data standard, DDMoRe, CDISC's SDTM and ADaM are all readily available and usable.)
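The automated comparison of results described above could be sketched roughly as follows in Python. This assumes each tool's output has already been reduced to a flat mapping of parameter name to value; that mapping, the parameter names, the tolerance, and the numbers are all made up for illustration and do not represent the DDMoRe standard output or any other agreed format.

```python
import math


def compare_results(reference, candidate, rel_tol=1e-3):
    """Return (parameter, reference, candidate) tuples that disagree.

    A parameter counts as a mismatch if it is missing from either side
    or if the two values differ by more than the relative tolerance.
    """
    mismatches = []
    for name in sorted(set(reference) | set(candidate)):
        ref = reference.get(name)
        cand = candidate.get(name)
        if ref is None or cand is None or not math.isclose(ref, cand, rel_tol=rel_tol):
            mismatches.append((name, ref, cand))
    return mismatches


# Made-up NCA results from two hypothetical tools.
tool_a = {"auclast": 102.4, "cmax": 12.1, "tmax": 1.5}
tool_b = {"auclast": 102.41, "cmax": 12.6, "tmax": 1.5}

mismatches = compare_results(tool_a, tool_b)
print(mismatches)
```

With a shared output format, a check like this can run in CI for every tool and dataset pair, and the tolerance becomes an explicit, reviewable choice rather than something buried in each tool's own test suite.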

mattfidler commented 5 years ago

@vjd I also don't understand what you are suggesting.

MarcLavielle commented 5 years ago

no problem!

Marc

