Data2Dynamics / d2d

a modeling environment tailored to parameter estimation in dynamical systems
https://github.com/Data2Dynamics/d2d

PEtab export fails to create 'conditions.tsv' file for example problem #170

Open vwiela opened 1 year ago

vwiela commented 1 year ago

I am trying to export the Merkle_JAK2STAT5_PCB2016 model from the examples folder to PEtab. I run the SetupFinal.m script to load and compile the model with the corresponding data, and then call arExportPEtab('name', true), which gives the following error:

Error using arExportPEtab (line 203)
The VariableNames property must contain one name for each variable in the table.

The step that fails is the creation of the condition table, specifically the following part of the arExportPEtab function:

rowToAdd = table;
for irow = 1:length(condPos)    
    if isnumeric(condVal(condPos == irow))
           tb = table(condVal(condPos == irow));
    else
           tb = table(condVal(condPos == irow));
    end
    tb.Properties.VariableNames = condPar(condPos == irow);    % <- call that fails
    rowToAdd = [rowToAdd tb];
end
condT = [condT; rowToAdd];

simuConditionID = ['model' num2str(1) '_data' num2str(idata)];
conditionID{end+1} = simuConditionID;

The problem in this example is that the jak2_stat5_h838_l1_final model has three parameters that specify the experimental condition, stored as peConds={'epo_level', 'isprediction', 'overexp'}. For the first data set only two of them are specified, which gives condPar={'overexp', 'epo_level'} and condPos=[3, 1]. In the second iteration, condPos == 2 matches nothing, so the code above creates tb as a 1x1 table with the column name 'Var1' whose entry is a 1x0 empty double row vector. condPar(condPos == 2) is also empty and cannot be assigned as the name of this one-column table, which triggers the error.

A second problem is that the for loop stops after two iterations, because length(condPos) is 2, so the third parameter, which is actually set for this data set, is never stored.
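Both problems can be reproduced outside d2d with a few lines. This is only a minimal sketch: the names and positions are the ones quoted above, while the numeric values in condVal are made up and condVal is assumed to be a numeric vector.

```matlab
% Names and positions as reported above; the values in condVal are hypothetical.
peConds = {'epo_level', 'isprediction', 'overexp'};
condPar = {'overexp', 'epo_level'};    % parameters set for the first data set
condPos = [3, 1];                      % their positions in peConds
condVal = [1, 5];                      % assumed numeric, values made up

irow  = 2;                             % second pass of the original loop
names = condPar(condPos == irow);      % 1x0 cell: nothing sits at position 2
tb    = table(condVal(condPos == irow)); % 1x1 table holding a 1x0 double

errMsg = '';
try
    tb.Properties.VariableNames = names;   % zero names for one variable
catch err
    errMsg = err.message;                  % the assignment is rejected
end
disp(errMsg)

% The original loop runs over 1:length(condPos), i.e. 1:2,
% so position 3 in peConds is never visited.
visited = 1:length(condPos);
missed  = peConds(~ismember(1:numel(peConds), visited));
disp(missed)
```

The try/catch is only there so that the snippet runs to the end; in arExportPEtab the same assignment aborts the export.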

My suggestion to fix both the error and the second problem would be:

numOfConds = length(peConds);

rowToAdd = table;
for irow = 1:numOfConds
    if isempty(condVal(condPos == irow))
        % parameter not specified for this data set: fill with NaN
        tb = table;
        tb.ONE = NaN;
    else
        tb = table(condVal(condPos == irow));
    end
    tb.Properties.VariableNames = peConds(irow);
    rowToAdd = [rowToAdd tb];
end
condT = [condT; rowToAdd];

This ensures that each data set gets a row with three columns, one per parameter, so that the rows can be merged into a conditions table. Parameters that are not specified for a data set get NaN entries.

This workaround solved the issue for this example for me; maybe there are more elegant solutions.
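One possible alternative, sketched here under the assumption that all condition values are numeric, is to fill a NaN matrix with one row per data set and convert it to a table once at the end. The two data sets and their values below are hypothetical, and condMat is a name introduced only for this illustration; it is not part of arExportPEtab.

```matlab
% Hypothetical input: two data sets with numeric condition values.
peConds = {'epo_level', 'isprediction', 'overexp'};
condPar = {{'overexp', 'epo_level'}, ...
           {'epo_level', 'isprediction', 'overexp'}};
condVal = {[1, 5], [2, 0, 3]};

% One row per data set, one column per entry of peConds.
% Entries that are never specified keep their NaN default.
condMat = NaN(numel(condPar), numel(peConds));
for idata = 1:numel(condPar)
    % condPos(k) is the column of peConds that condPar{idata}{k} belongs to
    [isKnown, condPos] = ismember(condPar{idata}, peConds);
    condMat(idata, condPos(isKnown)) = condVal{idata}(isKnown);
end

% Column names always come from peConds, so they never run out.
condT = array2table(condMat, 'VariableNames', peConds);
disp(condT)
```

Because the column names are taken from peConds rather than from condPar, every row has the same three columns regardless of how many parameters a data set specifies. Non-numeric condition values would need a cell array or string columns instead of a numeric matrix.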

Another issue with creating the 'condition.tsv' file is that the model has 75 data sets and the export therefore creates 75 different experimental conditions, although some of the parameter combinations are exactly the same. The model specification printed by arPrint indeed states that there are only 62 experimental conditions. So maybe the conditionID should be created in a different, more appropriate way. But I am not sure about this one.
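As a rough sketch of what such a creation could look like, identical parameter combinations can be collapsed with unique(..., 'rows') so that several data sets share one condition ID. The matrix below is made up, the 'condition%d' naming is hypothetical, and none of this is taken from the d2d code. Since unique treats NaN values as distinct from each other, the sketch replaces NaN by Inf before comparing, which assumes that Inf never occurs as a real condition value.

```matlab
% Hypothetical condition matrix: one row per data set, NaN = not specified.
condMat = [5 NaN 1;
           2   0 3;
           5 NaN 1;
           2   0 3;
           7   1 1];

% unique() treats NaNs as distinct, so rows containing NaN would never match.
% Assumption: Inf is not a valid condition value and can serve as a placeholder.
keyMat = condMat;
keyMat(isnan(keyMat)) = Inf;

% firstIdx: first data set of each unique condition (order of appearance)
% condIdx:  for every data set, the index of its unique condition
[~, firstIdx, condIdx] = unique(keyMat, 'rows', 'stable');

uniqueConds = condMat(firstIdx, :);    % one row per distinct condition
conditionID = arrayfun(@(k) sprintf('condition%d', k), condIdx, ...
                       'UniformOutput', false);
disp(conditionID')
```

With this mapping the condition table would contain only the rows in uniqueConds, and each data set would refer to its entry in conditionID.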