Open-Systems-Pharmacology / PK-Sim

PK-Sim® is a comprehensive software tool for whole-body physiologically based pharmacokinetic modeling

Merge Populations #2613

Open msevestre opened 1 year ago

msevestre commented 1 year ago

How can we implement this feature? This is not as easy as I thought.

So, in short, a MergePopulation will only combine the existing populations, and we won't allow editing most of its properties. Why? So that we can recreate it from a snapshot.

Feedback @PavelBal @StephanSchaller @Yuri05

msevestre commented 1 year ago

PS: why do we need this? Because we will create 3 different populations for HI (Score A to C), and we can combine them that way.

Yuri05 commented 1 year ago

only common expression profiles will be added automatically. Otherwise, they will be removed

I think we should not remove expression profiles, because this would change existing populations within the merged population. So I would suggest stopping the merge in such a case and suggesting that the user create new constant zero expression profiles and add them to all populations where the corresponding proteins are missing; after that, the merge could be performed.
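The pre-merge check suggested above could look roughly like the following. This is only a minimal sketch: `Population`, `missing_profiles`, and `check_mergeable` are hypothetical names used for illustration and are not part of the PK-Sim code base, and expression profiles are reduced to plain protein names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Population:
    # Hypothetical stand-in for a population; not the real PK-Sim class.
    name: str
    expression_profiles: Set[str] = field(default_factory=set)


def missing_profiles(populations: List[Population]) -> Dict[str, Set[str]]:
    """Map each population name to the proteins it lacks relative to the union."""
    all_proteins: Set[str] = set()
    for population in populations:
        all_proteins |= population.expression_profiles
    missing = {p.name: all_proteins - p.expression_profiles for p in populations}
    # Only report populations that actually lack something.
    return {name: proteins for name, proteins in missing.items() if proteins}


def check_mergeable(populations: List[Population]) -> None:
    """Stop the merge (raise) instead of silently removing expression profiles."""
    missing = missing_profiles(populations)
    if missing:
        details = "; ".join(
            f"{name} lacks {sorted(proteins)}" for name, proteins in sorted(missing.items())
        )
        raise ValueError(
            "Cannot merge: add constant zero expression profiles first (" + details + ")"
        )
```

The point of raising instead of dropping profiles is that none of the input populations is changed by the merge; the user fixes the inputs and retries.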

Yuri05 commented 1 year ago

A view will be shown listing the selected population as well as the base individual that will be used for this

I wonder if we should only allow merging of populations that were created based on the same individual.

My expectation is always: merging populations does not change any of the population properties. But if e.g. one of the original populations was created based on individual I1, and my merged population is suddenly created based on the individual I2 - this would change some properties of the individuals within the original population - which would be completely wrong in my opinion.

So I would suggest that in the "Merge populations" view no base individual has to be selected. Instead, it should be checked if all selected populations are based on the same individual. If yes - OK to merge. If no - merging not possible.
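As a rough illustration of that rule, here is a minimal sketch in Python. The function name and the mapping from population name to base individual name are assumptions made for the example; they do not reflect how PK-Sim stores this information.

```python
from typing import Dict


def common_base_individual(base_individual_by_population: Dict[str, str]) -> str:
    """Return the shared base individual, or raise if the populations disagree.

    The input maps a population name to the name of the individual it was
    created from (hypothetical representation, for illustration only).
    """
    individuals = set(base_individual_by_population.values())
    if len(individuals) != 1:
        raise ValueError(
            "Merging not possible: populations are based on "
            f"{len(individuals)} different individuals {sorted(individuals)}"
        )
    return individuals.pop()
```

With this check in place, the "Merge populations" view would not need a base individual selector at all: the individual is derived from the selection, or the merge is rejected.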

Yuri05 commented 1 year ago

General question: is a merged population a COPY of the used populations or just a bracket around them? So if one of the used populations is modified after merging, does this impact the merged population?

msevestre commented 1 year ago

So I would suggest that in the "Merge populations" view no base individual has to be selected. Instead, it should be checked if all selected populations are based on the same individual.

Well, this is exactly what we want to do: 3 populations based on 3 different individuals (A, B, and C). Otherwise, why would we want to merge populations in the first place?

I'd love to say it's a copy. That would be much easier. But then no snapshot...

msevestre commented 1 year ago

So unless I am mistaken, we need to be able to use different individuals. If someone changes a population after the fact, this will also create problems, like adding or removing an expression profile, etc.

Yuri05 commented 1 year ago

I'd love to say it's a copy. That would be much easier. But then no snapshot...

Everything else (besides the snapshot) can be achieved already now by merging via CSV export/import. And in the "csv-merged" population we can also add/modify user defined variability, add expressions etc.

So if it's a copy: then I would suggest to save the snapshots of all input populations as part of the merged population and export all of them into the project snapshot. Then the merged population can still be recreated even if the input populations were changed after the merge.

(By the way: if a user-defined variability was added in one of the input populations, it's displayed as an "unknown" distribution in the "csv-merged" population, which is quite nice. I never tried it before :) )

Yuri05 commented 1 year ago

A completely different approach to think about: instead of merging populations - allow selection of multiple populations when creating a simulation (with the same restrictions as discussed above: e.g. only populations with the same expression profiles can be selected etc.)

msevestre commented 1 year ago

So if it's a copy: then I would suggest to save the snapshots of all input populations as part of the merged population and export all of them into the project snapshot. Then the merged population can still be recreated even if the input populations were changed after the merge.

This simple feature is becoming way too complicated. Right now, we need to simplify or we won't ever ship.

msevestre commented 1 year ago

I really only want the Merge Population to be a parenthesis over existing populations, with restrictions. Maybe populations that are part of a merge become read-only and cannot be edited. Or, like for CSV-imported populations, we do not support snapshots for them.

Yuri05 commented 1 year ago

The simplest solution: create a "csv-merged" population. Thus in the background just export all input populations to temporary csv and reimport them into one merged pop in the same way as it's currently done for the pop csv import. The only gain: user does not need to export/import populations by hand.

Then no restrictions on the merged pop: can modify advanced distributions, expressions etc.
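To make the "export everything to CSV and reimport as one population" idea concrete, here is a minimal Python sketch that concatenates several population tables. It is an illustration under assumptions, not the PK-Sim import code: the `IndividualId` column name and the renumbering of ids are assumed, the parameter columns in the usage example are made up, and in-memory strings stand in for the temporary files.

```python
import csv
import io
from typing import List


def merge_population_csv(csv_texts: List[str], id_column: str = "IndividualId") -> str:
    """Concatenate population tables into one and renumber the individuals.

    Columns are the union of all input columns (in first-seen order);
    values missing in an input are left empty.
    """
    columns = [id_column]
    merged_rows = []
    for text in csv_texts:
        reader = csv.DictReader(io.StringIO(text))
        for name in reader.fieldnames or []:
            if name not in columns:
                columns.append(name)
        for row in reader:
            # Ids restart at 0 in every input, so assign a new running id.
            row[id_column] = str(len(merged_rows))
            merged_rows.append(row)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(merged_rows)
    return out.getvalue()


# Usage with two tiny, made-up population tables.
pop_a = "IndividualId,Weight\n0,70\n1,80\n"
pop_b = "IndividualId,Weight,Height\n0,60,170\n"
print(merge_population_csv([pop_a, pop_b]))
```

As noted above, the result has no link back to the input populations, which is what makes it freely editable and also why a snapshot cannot recreate it from its sources.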

msevestre commented 1 year ago

I still believe there is a case to be made for a reference-only merge of populations that is restricted in some ways. Or we do indeed the same as the CSV import: we completely lose the connection, and we are on the first individual only. For the snapshot, we could theoretically also export the population as CSV relative to the snapshot file and reference it in the snapshot.

msevestre commented 1 year ago

I have been thinking about it, and I think a merged population is not a good idea after all. What we could use instead is a way to split population results. So here: [image]

What could be done is simply to allow for proportions of populations. I have 3 simulations, say HI A, HI B, and HI C, and I want to reproduce a study with 10% A, 30% B, and 60% C.

We can add another column to this view (default is 100%) where we specify the proportion of EACH that we want to take. In the Comparison, we would need to save this percentage, as well as the ids of all randomly selected individuals (if I have 100 individuals and I take 10%, we would save 10 ids).

This would allow us to merge population OUTPUT. That way, we do not care whether they have the same user variability, enzymes, etc.
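The selection step could be sketched as follows in Python. This is only an illustration: `select_individuals` is a hypothetical helper, rounding to the nearest whole individual is an assumption, and the optional seed is there only to make the example reproducible. The returned ids are what would be saved in the comparison.

```python
import random
from typing import List, Optional, Sequence


def select_individuals(
    individual_ids: Sequence[int], percentage: float, seed: Optional[int] = None
) -> List[int]:
    """Randomly pick `percentage` percent of the individuals, without replacement."""
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    count = round(len(individual_ids) * percentage / 100)
    rng = random.Random(seed)
    return sorted(rng.sample(list(individual_ids), count))


# 100 individuals at 10% gives 10 saved ids.
selected = select_individuals(range(100), 10, seed=1)
print(len(selected))
```

Note that taking 10% of A, 30% of B, and 60% of C only reproduces a 10/30/60 study composition when the three simulations contain the same number of individuals; otherwise the percentages would have to be adjusted to the population sizes.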

msevestre commented 1 year ago

So we need 2 more columns: Percentage (editable) and Count (= total * percentage). Where we have the name, we could add the number of individuals as well, e.g. pop1 (n = 1500).