Closed IVANDOMENZAIN closed 3 years ago
This is currently being developed in a fork repository available here.
KmxGEM was missing the .xml
on master. Now it exists, so let's see how the auto-run pipeline behaves.
iYali has a problem, perhaps my GECKO fork should be updated?
INFO:__main__:
cd geckomat
model = load('/home/jenkins/ecModels-dependencies/Yali-GEM/ModelFiles/mat/iYali.mat');
model = model.model;
modelname = 'ecYaliGEM';
[ecModel, ecModel_batch] = enhanceGEM(model,'COBRA', modelname, '4.1.1');
cd ../models
save([modelname '/' modelname '.mat']','ecModel');
save([modelname '/' modelname '_batch.mat'], 'ecModel_batch');
quit
{ Unable to perform assignment with 0 elements on the right-hand side.
Error in manualModifications>otherChanges (line 425)
rxn_id = model.rxns{pos_rxn};
Error in manualModifications (line 113)
model = otherChanges(model);
Error in enhanceGEM (line 37)
[ecModel,modifications] = manualModifications(ecModel);
}
For Human-GEM I think we arrived to some conclusion wrt not providing any tissue GEMs, so there would be only one ecHuman-GEM which would make it fit with the rest of the pipeline, but I might be confused.
You're right @mihai-sysbio, for HumanGEM we're just thinking on adding the original general model into the ecModels pipeline.
I'm jotting down some ideas to discuss at the next meeting with @IVANDOMENZAIN and @BenjaSanchez :
databases
folders have been deleted, with content moved to scripts
. The contents of all the scripts
folders should be revised.dependencies.txt
files have been removed.model
folder can contain some files that need to be standardized across all models: kcatModifications.txt
, topUsedEnzymes.txt
~and model_batch.xml
~. ecXyzGEM
.Follow-up:
scripts/
folder - it doesn't look like it's moved over properlyscripts/
folderDatabases
and databases
folders.After fixing the issue with merging the content of the scripts
folder, the pipeline successfully created eciYali
in only about half an hour.
However, ecHuman
still fails after 2.5h of computation.
Great news about iYali
. Regarding ecHuman
, it is a large model, so it is expected to take longer than the others, specially in the steps of Kcat matching and saveECmodel (conversion to SBML). I can see the MATLAB log provided by GitHub actions but I cannot find any error message.
I can see the last MATLAB line being: prot_abundance file is not available. A default value of f=0.5 is set instead
, this means that the pipeline is entering the function getConstrainedModel.m
but apparently finding problems there. I will run this locally from this specific branch in order to find what's going on. Thanks for the update @mihai-sysbio.
The error is printed just above the Matlab output:
Error using getIndexes (line 70)
Incorrect value of the 'type' parameter. Allowed values are 'rxns', 'mets',
'metnames', 'metcomps' or 'genes'.
Error in changeMedia_batch (line 119)
metIndx = getIndexes(model,strcat(mediaComps{i},'[s]'),'metscomps');
Error in getConstrainedModel (line 46)
ecModel = changeMedia_batch(ecModel,c_source);
Error in enhanceGEM (line 63)
[ecModel_batch,OptSigma] = getConstrainedModel(ecModel,modifications,name);
Yeah, about that prot_abundance
file. From what I could see, this file is deleted only for Human-GEM here. Since this deletion is model specific, I went around it by overwriting that file with an empty file with the same name. I guess GECKO reacts differently to an empty file versus the file not being present?
Good! that you're pointing this out @mihai-sysbio, the file prot_abundance.txt
should actually be deleted for all those models for which such file is not provided in the scripts
folder (human, yarrowia, marxianus).
I see. Personally, I would find it simpler to provide an empty file and update a line in measureAbundance to also check for the file size to be non-zero. I'm thinking this would be more transparent to the user that
all files provided in
scripts
override what is in GECKO
and to avoid having an exception like
if specific files are not provided in
scripts
they are deleted from GECKO.
I see. Personally, I would find it simpler to provide an empty file and update a line in measureAbundance to also check for the file size to be non-zero. I'm thinking this would be more transparent to the user that
all files provided in
scripts
override what is in GECKOand to avoid having an exception like
if specific files are not provided in
scripts
they are deleted from GECKO.
@mihai-sysbio, a problem with this fix would be that several projects that have incorporated proteomics data into ecYeastGEM rely on this function, and this is an actual crucial step for their incorporation. I also like your suggested fix but this will also break backwards compatibility for those other projects. Easy to solve If I were the only person using this for proteomics but now we even have users from other groups relying on this.
@mihai-sysbio What about a halfway solution? by adding a second conditional in GECKO/geckomat/limit_proteins/measureAbundance from:
if exist(fileName,'file')~= 0
to:
if exist(fileName,'file')~= 0
fID = fopen('../../databases/prot_abundance.txt');
data = textscan(fID,'%s','delimiter','\n');
if ~isempty(data{1})
In this case empty files can also be compatible with this and wouldn't break compatibility for those users relying on scripts that delete the prot_abundance.txt
file prior to proteomics integration.
That sounds excellent! Let me know what branch of GECKO this is on, so I can use that instead of master
. And I'll add an empty prot_abundance.txt
to all 4 GEMs (listed at the top of this issue).
@mihai-sysbio, the fix has been introduced and tested both for an empty and non-empty prot_abundance.txt
file, you can find it here
These are 3 models remaining at the top of this issue:
ecYeast
I would need to know which Cobratoolbox version can successfully write (output) the model. 3.1
has been shown to not work.ecKmarx
several attempts have resulted in the same error:
Output argument "OptSigma" (and maybe others) not assigned during call to "getConstrainedModel".
Error in enhanceGEM (line 63)
[ecModel_batch,OptSigma] = getConstrainedModel(ecModel,modifications,name);
iML1515
since it's not on the update_all_ecmodels
branchfor
ecYeast
I would need to know which Cobratoolbox version can successfully write (output) the model. 3.1 has been shown to not work.
It turns out that the pipeline was failing with the same error for several versions of yeast-GEM (everything from yeast8.2.0 onwards), due to an addition in the notes field of the COBRA structure. I have addressed this with a temporary fix in https://github.com/SysBioChalmers/GECKO/pull/125, an opened https://github.com/SysBioChalmers/GECKO/issues/126 as a reminder to better address the problem in the future.
I still think we should pin COBRA to avoid future issues like this, until they have a more consistent releasing scheme. In particular, I confirmed the GECKO pipeline is fully working with commit 6c49aaf8d
.
Excellent! I've changed the COBRA version to that commit, and used the fix/remove-met-notes
branch of GECKO. ~However, it still fails to write the model for ecYeast.~
edit: ecYeast now works! see this PR
Now that eciSM996
works, what do you think about #29 - are we ready to merge in the MetabolicAtlas/ecModels
fork?
I'm thinking of closing this issue as is, and opening a new one specifically for the (potential) problems in ecMouseGEM.
@mihai-sysbio I agree
Complete automation of the ecModels pipeline for the following models: