PEtab-dev / libpetab-python

Python package for working with PEtab files
https://libpetab-python.readthedocs.io
MIT License
14 stars 5 forks source link

Simulator: rename measurement column to simulation #199

Closed dilpath closed 1 year ago

dilpath commented 1 year ago

What's your opinion on whether to default to measurement or simulation? Either is fine for me. Currently I am in favor of keeping measurement, because:

However, I see that simulation might be what users expect from a simulator.

see also: AMICI-dev/AMICI#2050 and AMICI-dev/AMICI#2051

codecov-commenter commented 1 year ago

Codecov Report

Merging #199 (3acdeef) into develop (1964ca1) will increase coverage by 0.03%. The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           develop     #199      +/-   ##
===========================================
+ Coverage    76.25%   76.28%   +0.03%     
===========================================
  Files           34       34              
  Lines         3155     3159       +4     
  Branches       764      765       +1     
===========================================
+ Hits          2406     2410       +4     
  Misses         552      552              
  Partials       197      197              
Impacted Files Coverage Δ
petab/simulate.py 82.45% <100.00%> (+1.32%) :arrow_up:
dweindl commented 1 year ago

However, I see that simulation might be what users expect from a simulator.

That.

* I have mostly used this for synthetic data generation

Maybe we should split it into a Simulator and a SyntheticDataGenerator, each with their natural default?

* It won't break user code

Fair point, but I'd say it is not so widely used yet, that I'd rather avoid confusing future users.

I would change rename to as_measurement or as_simulation, depending on the final choice of the default.

fbergmann commented 1 year ago

Looks good to me, though i of course would prefer to have the default for rename set to True, since I'd mainly expect a simulation dataframe when calling simulate.

dilpath commented 1 year ago

Thanks for the feedback! I opted to default to as_measurement=False, and otherwise keep it as-is.

The simulator now enforces the column name, such that the tool (e.g. AMICI/COPASI) naming choice of 'measurement' or 'simulation' doesn't affect the output here.

dilpath commented 1 year ago

Would be good to specify/document what simulate_without_noise is supposed to return - measurement or simulation.

Hm, at the moment the PEtab spec states

(optional) A simulation file, which has the same format as the measurement file, but contains model simulations [TSV]

But this looks like a typo, since the simulation column name is used in practice.

Currently, the docstring for simulate_without_noise specifies measurement (as in the spec):

Simulated data, as a PEtab measurements table, which should be equivalent to replacing all values in the :const:petab.C.MEASUREMENT column of the measurements table (of the PEtab problem supplied to the :meth:petab.simulate.Simulator.__init__ method), with simulated values.

But the simulate method now enforces a specific column name, so the user always sees consistent column names.

To me it would feel most natural to have simulation for no-noise, and measurement for with-noise. I don't know if this might be considered confusing by others.

Currently, as user can choose the column name, with as_measurement. Are you suggesting that I remove as_measurement and instead use Simulator.simulate(noise=True/False) determine the column name? The user is never meant to use simulate_without_noise, rather use Simulator.simulate(noise=True/False).

dweindl commented 1 year ago

Would be good to specify/document what simulate_without_noise is supposed to return - measurement or simulation.

Hm, at the moment the PEtab spec states

(optional) A simulation file, which has the same format as the measurement file, but contains model simulations [TSV]

But this looks like a typo, since the simulation column name is used in practice.

Agreed, should be changed.

Currently, the docstring for simulate_without_noise specifies measurement (as in the spec):

Simulated data, as a PEtab measurements table, which should be equivalent to replacing all values in the :const:petab.C.MEASUREMENT column of the measurements table (of the PEtab problem supplied to the :meth:petab.simulate.Simulator.__init__ method), with simulated values.

Sorry, I missed that. I accidentally looked at TestSimulator, which didn't have an informative docstring. All good.

To me it would feel most natural to have simulation for no-noise, and measurement for with-noise. I don't know if this might be considered confusing by others.

Currently, as user can choose the column name, with as_measurement. Are you suggesting that I remove as_measurement and instead use Simulator.simulate(noise=True/False) determine the column name? The user is never meant to use simulate_without_noise, rather use Simulator.simulate(noise=True/False).

No, I think it's clearly specified then.