PEtab-dev / libpetab-python

Python package for working with PEtab files
https://libpetab-python.readthedocs.io
MIT License
14 stars 6 forks source link

Unclear what is wrong with PEtab problem after running linter #123

Closed matthiaskoenig closed 2 years ago

matthiaskoenig commented 2 years ago

Hi all,

I am unclear what is wrong with my PEtab definition. I am not sure where the best place to ask is, so if this is the wrong issue tracker and there is a better place for practical problems with creating PEtab let me know.

When running the linter over my files (attached below) I get the errors below.

icg.zip

petablint -y /icg.yaml 
Mismatch of noise parameter overrides in:
observableId                                     model__ER_icg
preequilibrationConditionId                                NaN
simulationConditionId          Caesar1961_task_icg_inf_Control
datasetId                                     Caesar1961_dset0
replicateId                                                 R1
time                                                     200.0
measurement                                               0.62
noiseParameters                                       0.058523
observableParameters                                       NaN
Name: 21, dtype: object
Expected 0 but got 0
Condition table contains column for unknown entity 'Unnamed: 0'. Column names must match parameter, species or compartment IDs specified in the SBML model.
Not OK
dilpath commented 2 years ago

Attached is a PEtab problem that passes linting [1]. Changes are:

[1] icg2.zip [2] https://petab.readthedocs.io/projects/libpetab-python/en/latest/build/_autosummary/petab.conditions.html#petab.conditions.write_condition_df [3] See documentation for noiseFormula in the observables table: https://petab.readthedocs.io/en/latest/documentation_data_format.html#id2

dweindl commented 2 years ago

What Dilan said.

Expected 0 but got 0 is not helpful indeed. Needs to be fixed.

matthiaskoenig commented 2 years ago

In the current version the index is not removed from the condition table. I just submitted a bugfix for that. I used the petab.write_condition_df in my code.

dweindl commented 2 years ago

In the current version the index is not removed from the condition table. I just submitted a bugfix for that. I used the petab.write_condition_df in my code.

Hi Matthias, the condition table DataFrame is expected to use conditionId as index. This will be set correctly when using petab.get_condition_df. After that, petab.write_condition_df is supposed to also write that index out. That might have to be documented better.

matthiaskoenig commented 2 years ago

Okay get it, but this is really not documented well. I assume this is the same for the other tables with ids. Just close my pull request.

FFroehlich commented 2 years ago

Okay get it, but this is really not documented well. I assume this is the same for the other tables with ids. Just close my pull request.

I agree, see also #26

matthiaskoenig commented 2 years ago

Sorry forgot. Probably would be a good idea to just set the index in the write_* functions which expect an index. Then not everybody has to set the index themselves. Also no harm done if already set. Make it easy for users to create the dataframes with minimal overhead.

df.set_index(CONDITION_ID)
df.to_csv(filename, sep='\t', index=True)
matthiaskoenig commented 2 years ago

Could you give me push access to the repository, so I can open pull requests without having to fork the repository? I assume all pull requests go against develop after rebasing, or is there anything else I have to know?

matthiaskoenig commented 2 years ago

Hi all, I updated my PETab as far as I understood the input, but I am still stuck with the following errors

Mismatch of noise parameter overrides in:
observableId                                     model__ER_icg
preequilibrationConditionId                                NaN
simulationConditionId          Caesar1961_task_icg_inf_Control
datasetId                                     Caesar1961_dset0
replicateId                                                 R1
time                                                     200.0
measurement                                               0.62
noiseParameters                                       0.058523
observableParameters                                       NaN
Name: 21, dtype: object
Expected 0 but got 0
Not OK

Here is my updated PETab icg_petab.zip

Sorry, I am new to noise models and noise parameters, so it is not completely clear to me what is expected. I have data from multiple sources with some of the data points having standard deviations, others not. So I stored this information as noiseParameters in my measurement table. In addition I now defined the noiseParameters in the parameter table which correspond to the noise models in my observable table.

I have the feeling PETab does not work with measurements where only a subset of the data has standard deviations?

dilpath commented 2 years ago

I have the feeling PETab does not work with measurements where only a subset of the data has standard deviations?

Do you want noise to be estimated for the measurements that do not have a noise value in the measurements table? If so, put the estimated noise parameter ID in the noiseParameters column of the measurements table, for the measurements that are missing a noise value. I would also suggest not setting the estimated noise parameter ID to match the special string that PEtab uses for substitution in noise formulas in the observables table. i.e. don't name an estimated noise parameter like noiseParameter${indexOfNoiseParameter}_${observableId}. Common naming might be e.g. sd_${observableId} but any other string would be fine.

If you do not want to estimate noise, you can specify e.g. 1 in the noiseParameters column for the missing values, if you want to default to least squares. Equivalently, you could also define two observables, one for measurements that have noise and one for those that do not. For measurements that do not have noise, leave the noiseParameters column in the measurements table empty and define the noiseFormula in the observables table as simply 1 (with a normal noise distribution).

Let me know if you'd like an example of any of these options.

dweindl commented 2 years ago

Sorry forgot. Probably would be a good idea to just set the index in the write_* functions which expect an index. Then not everybody has to set the index themselves. Also no harm done if already set. Make it easy for users to create the dataframes with minimal overhead.

df.set_index(CONDITION_ID)
df.to_csv(filename, sep='\t', index=True)

Sounds good. Needs to check whether there still is a CONDITION_ID column (there won't be if it's already set as index).

Could you give me push access to the repository, so I can open pull requests without having to fork the repository?

Invited.

I assume all pull requests go against develop after rebasing, or is there anything else I have to know?

Tests need to pass / appropriate tests should be added and code needs to be compatible with python3.7. That's all, I'd say.

dweindl commented 2 years ago

When running the linter over my files (attached below) I get the errors below.

* Not sure what to do wth the `Expected 0, but got 0` message

@matthiaskoenig: Running petablint on the provided files, I get Expected 1 but got 0 (+ invalid condition ID errors). Not sure what when wrong in your case. Can you still reproduce that? With which version of this package?

Update: I used the wrong set of files :see_no_evil: . Reproducible now.

Update2: Fixed in #129. Expected 0 but got 1.

matthiaskoenig commented 2 years ago

Closing this for now. Figured out some of the issues.

But for me it is still not clear how to work correctly with data where a subset has SD and another subset does not have such errors. Is there a meeting/help channel where I could discuss this with expert(s) for a few minutes?

dweindl commented 2 years ago

Is there a meeting/help channel where I could discuss this with expert(s) for a few minutes?

Could use the COMBINE-slack PEtab channel, the PEtab GitHub discussions page or we can arrange a call.