opencobra / cobratoolbox

The COnstraint-Based Reconstruction and Analysis Toolbox. Documentation:
https://opencobra.github.io/cobratoolbox
Other
247 stars 311 forks source link

Problem Literal nodes in FormulaParser #1058

Closed lvalcarcel closed 6 years ago

lvalcarcel commented 6 years ago

Please include a short description of problem here

I have tried to use the Formula parser algorithm with the rules of the new model Recon3D.

When I try to use FormulaParser in reaction ATPS4mi, I obtain some strange literal nodes:

x($450|$449) or x(3595) ......

Moreover, x(450) or x(449) are not present in the rule.

Using the following lines of code, I can detect all the literal nodes of this reaction:

fp = FormulaParser();
idx = findRxnIDs(model,'ATPS4mi'); 
head = fp.parseFormula(model.rules{idx});
unique(head.getLiterals())

'$160|$159' '$405|$404' '$417|$416' '$450|$449' '$9|$8' '3577' '3578' '3579' '3580' '3581' '3582' '3583' '3584' '3585' '3586' '3587' '3588' '3589' '3590' '3591' '3592' '3593' '3594' '3595' '3596' '3597' '3598' '3599' '3600' '3601' '3602' '3603' '3604' '3605' '3606' '3607' '3608' '3609' '3610' '3611' '3612'

I have used this function with previous versions of Recon and it has worked prefectly.

I hereby confirm that I have:

(Note: You may replace [] with [X] to check the box)

tpfau commented 6 years ago

This was a bug in FormulaParser, which was originally built for grRules, where all And/Or operators have to be separated by spaces, and this was not correctly adapted to rules, where this is not necessary (as | and & cannot be confused with parts of a gene name). Should be fixed by #1061

lvalcarcel commented 6 years ago

Thanks a lot @tpfau.
I have the same problem using grRules instead of rules:

'$160or$159' '$405or$404' '$417or$416' '$450or$449' '$9or$8' '10476.1' '10476.2' '10632.1' '267020.1' '27109.1' '4905.1' '498.1' '498.2' '498.3' '506.1' '509.1' '509.2' '513.1' '513.2' '514.1' '514.2' '515.1' '515.2' '515.3' '516.1' '516.2' '517.1' '517.2' '518.1' '518.2' '518.3' '521.1' '522.1' '522.2' '522.3' '522.4' '522.5' '539.1' '9551.1' '9551.2' '9551.3'

I am glad to read that this issue has been taken into account.

On the other hand, I have used rules instead of grRules following the example in the function 'GPRparser()': GPRparser

fp = FormulaParser();
head = fp.parseFormula(model.rules{i});
tpfau commented 6 years ago

Hmm... Ok, I need to have a look at the parser again. The only thing I can think of at the moment is that there is some oddity where an or is not separated, or becomes unseparated by the replacements. I'll need to check this.

tpfau commented 6 years ago

I updated the PR. This should now also be adressed: However: Be aware, that the formula Parser is only tested and only reliably works with rules presented in the logical format used in the rules field. Use on grRules "might" work but some functions assume that the literals are essentially numbers and can be used as indices for a vector (like Node.getFunctionalGeneSets).

laurentheirendt commented 6 years ago

Thanks for reporting @lvalcarcelg and fixing @tpfau! 👍