zakandrewking closed this 9 years ago
Hey, so I was just talking with Justin about this and we ran into a couple of issues.
The first is that most models don't have associated KEGG IDs. Ideally they all would, and maybe I'm wrong, but I'm pretty sure only 10-20% of what we load will have them.
The second is that we probably need our own non-external unique IDs for universal components, because there are going to be many more types of components than just metabolites. The above scheme would work, but it would produce a massive number of flagged rows.
I can't check on this right now, but I remember going into Simpheny and seeing KEGG ids for almost every metabolite I checked. Definitely the central metabolic ones. I wonder if they never made it into GRMIT?
It's OK to have a massive number of flagged rows. We have to deal with this someday, and there are many automated approaches to consider. Andreas has done something very similar.
For non-metabolites, we should try to come up with external IDs where possible. For anything that's part of a template reaction, we can get fancy. For instance, a transcription elongation reaction could be linked to the reaction template AND to an external gene ID. But we don't have to solve that immediately.
The KEGG and CAS IDs were imported from Simpheny.
Hi guys,
IMHO, having our own ID scheme in addition to providing references to external KEGG IDs would be a great idea. If BiGG IDs were consistent and unique, other users could refer to us instead of pointing to KEGG etc. It would be very nice if many models contained references to BiGG, ultimately increasing our access count when people look those up.
Cheers Andreas
Dr. Andreas Draeger University of California, San Diego, La Jolla, CA 92093-0412, USA Bioengineering Dept., Systems Biology Research Group, Office #2506 Phone: +1-858-534-9717, Fax: +1-858-822-3120, twitter: @dr_drae
Ok, so how does this sound as a temporary solution?
select * from metabolite where kegg_id is null;
should do the trick for keeping track of metabolites that need curation. What do you think?
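For reference, here is a minimal sketch of that curation query run against an in-memory SQLite database. The `metabolite` table and `kegg_id` column come from the query above; the `id` and `name` columns, the helper name `metabolites_needing_curation`, and the sample rows are assumptions for illustration and not the actual BiGG schema.

```python
import sqlite3


def metabolites_needing_curation(conn):
    """Return (id, name) for every metabolite row that has no KEGG ID."""
    cur = conn.execute(
        "SELECT id, name FROM metabolite WHERE kegg_id IS NULL ORDER BY id"
    )
    return cur.fetchall()


# Hypothetical schema and rows, only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metabolite ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, kegg_id TEXT)"
)
conn.executemany(
    "INSERT INTO metabolite (name, kegg_id) VALUES (?, ?)",
    [
        ("D-Glucose", "C00031"),
        ("ATP", "C00002"),
        ("hypothetical lipid X", None),  # no KEGG ID, so it is flagged
    ],
)

flagged = metabolites_needing_curation(conn)
print(flagged)  # [(3, 'hypothetical lipid X')]
```

With this approach no separate "flagged" column is needed: a row counts as flagged simply because its `kegg_id` is NULL.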
Actually, on re-reading, this is essentially what you originally proposed, isn't it?
Bingo :8ball: (don't read into that)
Yeah, right now I've set it up so that a universal metabolite with a missing KEGG ID gets that ID filled in when another metabolite with the same name and a KEGG ID is loaded into the database.
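A rough sketch of that back-fill rule, using a plain dictionary in place of the database. The names `UniversalMetabolite` and `load_metabolite` are hypothetical, and matching on the exact name string is an assumption. The comment above does not say what happens when a second, conflicting KEGG ID shows up; this sketch simply keeps the first one.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class UniversalMetabolite:
    name: str
    kegg_id: Optional[str] = None


def load_metabolite(
    universe: Dict[str, UniversalMetabolite],
    name: str,
    kegg_id: Optional[str] = None,
) -> UniversalMetabolite:
    """Add a metabolite to the universe, back-filling a missing KEGG ID."""
    existing = universe.get(name)
    if existing is None:
        # First time this name is seen: create the universal entry.
        existing = UniversalMetabolite(name, kegg_id)
        universe[name] = existing
    elif existing.kegg_id is None and kegg_id is not None:
        # Same name, and the stored entry lacks a KEGG ID: fill it in.
        existing.kegg_id = kegg_id
    # Assumption: an already-set KEGG ID is never overwritten.
    return existing


universe: Dict[str, UniversalMetabolite] = {}
load_metabolite(universe, "Pyruvate")            # loaded without a KEGG ID
load_metabolite(universe, "Pyruvate", "C00022")  # later model supplies one
load_metabolite(universe, "Pyruvate", "C99999")  # conflicting ID is ignored
print(universe["Pyruvate"].kegg_id)  # C00022
```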
I just talked to John about putting KEGG IDs into his newly updated SBML models, and he said that he personally doesn't think KEGG IDs are good universal IDs. He mentioned that MetaNetX is better.
Hmmmm ok, I would be fine with MetaNetX. I was always pretty impressed with their atom mapping work. I'm not sure it has been fully implemented within MetaNetX yet, but I think it will be, and at that point it will be a very sophisticated, probably dominant, resource. The downside is that KEGG has a lot of visibility outside of constraint-based sysbio, so if we go with MetaNetX IDs we potentially lose some visibility. The upside is that Costas and MetaNetX aren't going anywhere and will only continue to get better. Costas's lab is also a friendly one, so if an official collaboration needed to happen or larger things were to move forward, it would likely be a good situation.
We aren't technically using KEGG IDs as universal IDs: we are using KEGG to generate universal BiGG IDs. So we don't have to limit ourselves to one type of external reference ID. It's worth thinking about this more.
Does Jon already have MetaNetX IDs for his models?
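To make the distinction concrete, here is a small sketch of a universal component that has its own internal ID and any number of external references, so KEGG and MetaNetX can sit side by side. Everything in it is hypothetical: the class names, the rule that two entries are merged when they share any external ID, and the MetaNetX value, which is a placeholder and not a real identifier.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List, Optional


@dataclass
class UniversalComponent:
    internal_id: int
    # Maps a source name (e.g. "kegg", "metanetx") to the ID in that source.
    external_ids: Dict[str, str] = field(default_factory=dict)


class Universe:
    def __init__(self) -> None:
        self._next_id = count(1)
        self.components: List[UniversalComponent] = []

    def find(self, external_ids: Dict[str, str]) -> Optional[UniversalComponent]:
        """Return the first component sharing any (source, ID) pair."""
        for comp in self.components:
            if any(
                comp.external_ids.get(source) == ext_id
                for source, ext_id in external_ids.items()
            ):
                return comp
        return None

    def add(self, external_ids: Dict[str, str]) -> UniversalComponent:
        """Reuse a matching component or create one with a new internal ID."""
        comp = self.find(external_ids)
        if comp is None:
            comp = UniversalComponent(next(self._next_id))
            self.components.append(comp)
        for source, ext_id in external_ids.items():
            # Keep references that are already stored; only add new sources.
            comp.external_ids.setdefault(source, ext_id)
        return comp


u = Universe()
a = u.add({"kegg": "C00022"})
b = u.add({"kegg": "C00022", "metanetx": "MNXM_PLACEHOLDER"})  # placeholder ID
c = u.add({})  # no external reference: still gets its own internal ID
print(a is b, sorted(a.external_ids), c.internal_id)
```

In this layout a component with an empty `external_ids` dictionary plays the role of a flagged row that still needs curation.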
I don't think so, actually. He might have some, but they're definitely not in his SBML files. Today I talked with Jon about pulling out the KEGG IDs and CAS numbers and putting them into his cobrapy objects. He'll send his new models (with KEGG IDs) once he's done updating his Python script.
Let's talk about all this on Tuesday during code talk. I think this is very important and deserves a few words of direct discussion.