galaxy-genome-annotation / docker-tripal

Docker container for Tripal

Need taxonomic rank ontology #6

Closed abretaud closed 7 years ago

abretaud commented 7 years ago

Tripal needs the taxonomic rank ontology to be loaded in Chado (it is required to create an organism from the Tripal loader). It is available at http://purl.obolibrary.org/obo/taxrank.obo and could perhaps be added to the Chado-Prebuilt-Schemas?

We could also replace http://www.obofoundry.org/ro/ro.obo with http://purl.obolibrary.org/obo/ro.obo to fix the 404 error.
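A quick way to confirm which ontology URLs still resolve is sketched below. It is only an illustration: the helper names `http_status` and `broken_urls` are made up for this example, and the `opener` parameter exists so the check can be exercised without network access.

```python
import urllib.error
import urllib.request

# URLs mentioned in this issue.
ONTOLOGY_URLS = [
    "http://www.obofoundry.org/ro/ro.obo",         # old location, reported as 404 here
    "http://purl.obolibrary.org/obo/ro.obo",       # proposed replacement
    "http://purl.obolibrary.org/obo/taxrank.obo",  # taxonomic rank ontology
]


def http_status(url, opener=urllib.request.urlopen, timeout=30):
    """Return the HTTP status code for url, or None if the host is unreachable."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # HTTPError must be caught before URLError, which is its parent class.
        return err.code
    except urllib.error.URLError:
        return None


def broken_urls(urls, opener=urllib.request.urlopen):
    """Return the URLs that do not answer with HTTP 200."""
    return [url for url in urls if http_status(url, opener=opener) != 200]
```

Calling `broken_urls(ONTOLOGY_URLS)` before a build would list any ontology source that needs a new location.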

hexylena commented 7 years ago

:(

./Build ontologies
Available ontologies:
[1] Relationship Ontology
[2] Sequence Ontology
[3] Gene Ontology
[4] Chado Feature Properties
[5] Plant Ontology
[6] Taxonomic Rank

Which ontologies would you like to load (Comma delimited)? [0]  fetching files for Relationship Ontology
  +http://purl.obolibrary.org/obo/ro.obo
    updated
    loading...ERROR:  null value in column "cv_id" violates not-null constraint
DETAIL:  Failing row contains (33, null, null, null, 33, 0, 0).
STATEMENT:  INSERT INTO cvterm (dbxref_id) VALUES ($1)
DBD::Pg::st execute failed: ERROR:  null value in column "cv_id" violates not-null constraint
DETAIL:  Failing row contains (33, null, null, null, 33, 0, 0). [for Statement "INSERT INTO cvterm (dbxref_id) VALUES (?)" with ParamValues: 1='33'] at /usr/local/share/perl/5.20.2/DBIx/DBStag.pm line 3322.
DBD::Pg::st execute failed: ERROR:  null value in column "cv_id" violates not-null constraint
DETAIL:  Failing row contains (33, null, null, null, 33, 0, 0). [for Statement "INSERT INTO cvterm (dbxref_id) VALUES (?)" with ParamValues: 1='33'] at /usr/local/share/perl/5.20.2/DBIx/DBStag.pm line 3322.
 at /usr/local/share/perl/5.20.2/DBIx/DBStag.pm line 3332.
        DBIx::DBStag::insertrow(DBIx::DBStag=HASH(0x11b4058), "cvterm", HASH(0x1c87af0), "cvterm_id") called at /usr/local/share/perl/5.20.2/DBIx/DBStag.pm line 1928
        DBIx::DBStag::_storenode(DBIx::DBStag=HASH(0x11b4058), Data::Stag::StagImpl=ARRAY(0x23f69c8)) called at /usr/local/share/perl/5.20.2/DBIx/DBStag.pm line 1180
        DBIx::DBStag::storenode(DBIx::DBStag=HASH(0x11b4058), Data::Stag::StagImpl=ARRAY(0x23f6320)) called at /usr/local/bin/stag-storenode.pl line 88
        eval {...} called at /usr/local/bin/stag-storenode.pl line 87
        main::store(Data::Stag::BaseHandler=HASH(0x1f4d148), Data::Stag::StagImpl=ARRAY(0x23f6320)) called at /usr/local/bin/stag-storenode.pl line 137
        main::__ANON__(Data::Stag::BaseHandler=HASH(0x1f4d148), Data::Stag::StagImpl=ARRAY(0x23f6320)) called at /usr/local/share/perl/5.20.2/Data/Stag/BaseHandler.pm line 601
        Data::Stag::BaseHandler::end_event(Data::Stag::BaseHandler=HASH(0x1f4d148), "cvterm") called at /usr/local/share/perl/5.20.2/Data/Stag/BaseHandler.pm line 742
        Data::Stag::BaseHandler::end_element(Data::Stag::BaseHandler=HASH(0x1f4d148), HASH(0x23f6d40)) called at /usr/local/share/perl/5.20.2/XML/Parser/PerlSAX.pm line 239
        XML::Parser::PerlSAX::_handle_end(XML::Parser::PerlSAX=HASH(0x1fc5980), XML::Parser::Expat=HASH(0x1e96778), "cvterm") called at /usr/local/share/perl/5.20.2/XML/Parser/PerlSAX.pm line 79
        XML::Parser::PerlSAX::__ANON__(XML::Parser::Expat=HASH(0x1e96778), "cvterm") called at /usr/local/lib/x86_64-linux-gnu/perl/5.20.2/XML/Parser/Expat.pm line 474
        XML::Parser::Expat::parse(XML::Parser::Expat=HASH(0x1e96778), FileHandle=GLOB(0x214d958)) called at /usr/local/lib/x86_64-linux-gnu/perl/5.20.2/XML/Parser.pm line 187
        eval {...} called at /usr/local/lib/x86_64-linux-gnu/perl/5.20.2/XML/Parser.pm line 186
        XML::Parser::parse(XML::Parser=HASH(0x1f44868), FileHandle=GLOB(0x214d958)) called at /usr/local/share/perl/5.20.2/XML/Parser/PerlSAX.pm line 147
        XML::Parser::PerlSAX::parse(XML::Parser::PerlSAX=HASH(0x1fc5980), "Source", HASH(0x1fc5920), "Handler", Data::Stag::BaseHandler=HASH(0x1f4d148)) called at /usr/local/share/perl/5.20.2/Data/Stag/XMLParser.pm line 69
        Data::Stag::XMLParser::parse_fh(Data::Stag::XMLParser=HASH(0x1f4cd40), FileHandle=GLOB(0x214d958)) called at /usr/local/share/perl/5.20.2/Data/Stag/BaseGenerator.pm line 476
        Data::Stag::BaseGenerator::parse(Data::Stag::XMLParser=HASH(0x1f4cd40), "-file", "tmp/obo/OBO_REL/ro.oboxml", "-str", undef, "-fh", undef) called at /usr/local/share/perl/5.20.2/Data/Stag/XMLParser.pm line 58
        Data::Stag::XMLParser::parse(Data::Stag::XMLParser=HASH(0x1f4cd40), "-file", "tmp/obo/OBO_REL/ro.oboxml", "-str", undef, "-fh", undef) called at /usr/local/share/perl/5.20.2/Data/Stag/StagImpl.pm line 275
        Data::Stag::StagImpl::parse("Data::Stag", "-format", undef, "-file", "tmp/obo/OBO_REL/ro.oboxml", "-handler", Data::Stag::BaseHandler=HASH(0x1f4d148)) called at /usr/local/share/perl/5.20.2/Data/Stag.pm line 181
        Data::Stag::AUTOLOAD("Data::Stag", "-format", undef, "-file", "tmp/obo/OBO_REL/ro.oboxml", "-handler", Data::Stag::BaseHandler=HASH(0x1f4d148)) called at /usr/local/bin/stag-storenode.pl line 143
System call 'stag-storenode.pl -d 'dbi:Pg:dbname=postgres;host=localhost;port=5432' --user postgres  --password 'postgres'  tmp/obo/OBO_REL/ro.oboxml' 
failed: 256
failed: 256 at lib/Bio/Chado/Builder.pm line 368.
Makefile:1287: recipe for target 'ontologies' failed
make: *** [ontologies] Error 2
Makefile:5: recipe for target 'schema' failed
make: *** [schema] Error 2
hexylena commented 7 years ago

@scottcain helpfully suggested an alternative on IRC: https://raw.githubusercontent.com/oborel/obo-relations/master/subsets/ro-chado.obo

Unfortunately, this crashes as well:

Which ontologies would you like to load (Comma delimited)? [0]  fetching files for Relationship Ontology
  +https://raw.githubusercontent.com/oborel/obo-relations/master/subsets/ro-chado.obo
    updated
    loading...ERROR:  duplicate key value violates unique constraint "cvterm_c2"
DETAIL:  Key (dbxref_id)=(32) already exists.
STATEMENT:  INSERT INTO cvterm (cv_id, name, is_relationshiptype, dbxref_id) VALUES ($1, $2, $3, $4)
DBD::Pg::st execute failed: ERROR:  duplicate key value violates unique constraint "cvterm_c2"
DETAIL:  Key (dbxref_id)=(32) already exists. [for Statement "INSERT INTO cvterm (cv_id, name, is_relationshiptype, dbxref_id) VALUES (?, ?, ?, ?)" with ParamValues: 1='11', 2='is_a', 3='1', 4='32'] at /usr/local/share/perl/5.20.2/DBIx/DBStag.pm line 3322.
DBD::Pg::st execute failed: ERROR:  duplicate key value violates unique constraint "cvterm_c2"
DETAIL:  Key (dbxref_id)=(32) already exists. [for Statement "INSERT INTO cvterm (cv_id, name, is_relationshiptype, dbxref_id) VALUES (?, ?, ?, ?)" with ParamValues: 1='11', 2='is_a', 3='1', 4='32'] at /usr/local/share/perl/5.20.2/DBIx/DBStag.pm line 3322.
 at /usr/local/share/perl/5.20.2/DBIx/DBStag.pm line 3332.
hexylena commented 7 years ago

Finally got around to extracting the project.

https://github.com/erasche/chado-schema-builder

It could? should? move to GMOD probably, but I'm happy to maintain it.

abretaud commented 7 years ago

Thanks! The error is... strange. Sometimes it works, sometimes not.

I enabled debug logging (the DBSTAG_TRACE env var), and the error seems to be related to the "is_a" stanza in ro-chado.obo.
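For anyone wanting to reproduce the traced run, a minimal sketch follows. The command-line flags are copied from the failing system call in the log above; the helper names `build_trace_run` and `run_traced` are made up for this example, and setting the variable to `"1"` is an assumption about what value enables tracing.

```python
import os
import subprocess


def build_trace_run(oboxml_path, dsn, user, password, base_env=None):
    """Return (command, env) for a stag-storenode.pl run with DBSTAG_TRACE set."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: any true value switches tracing on.
    env["DBSTAG_TRACE"] = "1"
    command = [
        "stag-storenode.pl",
        "-d", dsn,
        "--user", user,
        "--password", password,
        oboxml_path,
    ]
    return command, env


def run_traced(oboxml_path, dsn, user, password):
    """Run the loader with tracing enabled; raises CalledProcessError on failure."""
    command, env = build_trace_run(oboxml_path, dsn, user, password)
    return subprocess.run(command, env=env, check=True)
```

With the values from the log, that would be `run_traced("tmp/obo/OBO_REL/ro.oboxml", "dbi:Pg:dbname=postgres;host=localhost;port=5432", "postgres", "postgres")`.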

The error goes away (in both Chado 1.23 and 1.31) if I just remove that stanza, and OBO_REL:is_a appears to be created in the db anyway: https://raw.githubusercontent.com/abretaud/obo-relations/master/subsets/ro-chado.obo

This "is_a" stuff was discussed to some extent in oborel/obo-relations#68. Ping @scottcain: maybe you understand better than I do what is happening here?
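The manual edit can be sketched as a small filter over the OBO text. This is an illustration only: `drop_stanza` is a made-up helper, it assumes stanzas are separated by blank lines and that `id:` values carry no trailing `!` comment, and the sample text is invented rather than taken from the real ro-chado.obo.

```python
def drop_stanza(obo_text, stanza_id):
    """Return obo_text without the stanza whose `id:` tag equals stanza_id."""
    kept = []
    for block in obo_text.split("\n\n"):
        lines = block.strip().splitlines()
        # A stanza starts with a bracketed type line such as [Term] or [Typedef].
        is_stanza = bool(lines) and lines[0].startswith("[")
        ids = [line.split(":", 1)[1].strip() for line in lines if line.startswith("id:")]
        if is_stanza and stanza_id in ids:
            continue  # skip the unwanted stanza
        kept.append(block)
    return "\n\n".join(kept)


# Invented sample, only to show the shape of the input.
SAMPLE_OBO = (
    "format-version: 1.2\n"
    "\n"
    "[Typedef]\n"
    "id: OBO_REL:is_a\n"
    "name: is_a\n"
    "\n"
    "[Typedef]\n"
    "id: OBO_REL:part_of\n"
    "name: part_of\n"
)

cleaned = drop_stanza(SAMPLE_OBO, "OBO_REL:is_a")
```

The header block and every other stanza pass through unchanged, so the result can be written back out as a drop-in replacement file.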

hexylena commented 7 years ago

@abretaud well....let's use that. Got a copy building now.

hexylena commented 7 years ago
Which ontologies would you like to load (Comma delimited)? [0]  fetching files for Relationship Ontology
  +https://raw.githubusercontent.com/abretaud/obo-relations/master/subsets/ro-chado.obo
    updated
    loading...done!

great work @abretaud! :)

hexylena commented 7 years ago

https://cpt.tamu.edu/jenkins/job/Chado-Prebuilt-Schemas/62/console I'm restarting the jobs... hopefully with this fix we'll start being able to use the latest builds instead of ancient ones :D

abretaud commented 7 years ago

Fingers crossed! I made a PR there: oborel/obo-relations#115. We'll see if it gets merged.

Is there a particular reason to use the master branch of Chado? Its state looks like it is somewhere between the 1.23 and 1.31 releases (ahead of 1.23, but behind 1.31).

hexylena commented 7 years ago

Mostly because it was what I always did.

I was just looking into automatically using the version in the output filenames, and couldn't figure out why I couldn't find 1.31 anywhere. Glad you noticed this :)

hexylena commented 7 years ago
fetching files for Gene Ontology
  +http://www.geneontology.org/ontology/gene_ontology.obo
    updated
    loading...ERROR:  duplicate key value violates unique constraint "cvterm_c2"
DETAIL:  Key (dbxref_id)=(532) already exists.
STATEMENT:  INSERT INTO cvterm (cv_id, name, dbxref_id, is_relationshiptype) VALUES ($1, $2, $3, $4)
DBD::Pg::st execute failed: ERROR:  duplicate key value violates unique constraint "cvterm_c2"
DETAIL:  Key (dbxref_id)=(532) already exists. [for Statement "INSERT INTO cvterm (cv_id, name, dbxref_id, is_relationshiptype) VALUES (?, ?, ?, ?)" with ParamValues: 1='16', 2='part_of', 3='532', 4='1'] at /usr/local/share/perl/5.20.2/DBIx/DBStag.pm line 3322.
DBD::Pg::st execute failed: ERROR:  duplicate key value violates unique constraint "cvterm_c2"
DETAIL:  Key (dbxref_id)=(532) already exists. [for Statement "INSERT INTO cvterm (cv_id, name, dbxref_id, is_relationshiptype) VALUES (?, ?, ?, ?)" with ParamValues: 1='16', 2='part_of', 3='532', 4='1'] at /usr/local/share/perl/5.20.2/DBIx/DBStag.pm line 3322.
 at /usr/local/share/perl/5.20.2/DBIx/DBStag.pm line 3332.
        DBIx::DBStag::insertrow(DBIx::DBStag=HASH(0x1342058), "cvterm", HASH(0x1e15cf0), "cvterm_id") called at /usr/local/share/perl/5.20.2/DBIx/DBSta

That was on master just now. Switching to 1.31 and trying again.

abretaud commented 7 years ago

It's worrying if it also fails on GO... FYI, I'll be away from the internet next week.

Cheers,
Anthony

hexylena commented 7 years ago

Getting closer...

Which ontologies would you like to load (Comma delimited)? [0]  fetching files for Relationship Ontology
  +https://raw.githubusercontent.com/abretaud/obo-relations/master/subsets/ro-chado.obo
    updated
    loading...done!
fetching files for Sequence Ontology
  +http://song.cvs.sourceforge.net/*checkout*/song/ontology/so.obo
    updated
    loading...done!
fetching files for Gene Ontology
  +http://www.geneontology.org/ontology/gene_ontology.obo
    updated
    loading...done!
fetching files for Chado Feature Properties
  +load/etc/feature_property.obo
    loading...done!
fetching files for Plant Ontology
  +http://palea.cgrb.oregonstate.edu/viewsvn/Poc/trunk/ontology/OBO_format/po_anatomy.obo?view=co
    updated
    loading...ERROR:  duplicate key value violates unique constraint "cvterm_c2"
DETAIL:  Key (dbxref_id)=(494) already exists.
STATEMENT:  INSERT INTO cvterm (cv_id, name, is_relationshiptype, dbxref_id) VALUES ($1, $2, $3, $4)
DBD::Pg::st execute failed: ERROR:  duplicate key value violates unique constraint "cvterm_c2"
DETAIL:  Key (dbxref_id)=(494) already exists. [for Statement "INSERT INTO cvterm (cv_id, name, is_relationshiptype, dbxref_id) VALUES (?, ?, ?, ?)" with ParamValues: 1='16', 2='has_part', 3='1', 4='494'] at /usr/local/share/perl/5.20.2/DBIx/DBStag.pm line 3322.
DBD::Pg::st execute failed: ERROR:  duplicate key value violates unique constraint "cvterm_c2"
DETAIL:  Key (dbxref_id)=(494) already exists. [for Statement "INSERT INTO cvterm (cv_id, name, is_relationshiptype, dbxref_id) VALUES (?, ?, ?, ?)" with ParamValues: 1='16', 2='has_part', 3='1', 4='494'] at /usr/local/share/perl/5.20.2/DBIx/DBStag.pm line 3322.
 at /usr/local/share/perl/5.20.2/DBIx/DBStag.pm line 3332.
        DBIx::DBStag::insertrow(DBIx::DBStag=HASH(0x1f99058), "cvterm", HASH(0x2a6cb60), "cvterm_id") called at /usr/local/share/perl/5.20.2/DBIx/DBSta
g.pm line 1928
        DBIx::DBStag::_storenode(DBIx::DBStag=HASH(0x1f99058), Data::Stag::StagImpl=ARRAY(0x27e2a18)) called at /usr/local/share/perl/5.20.2/DBIx/DBSta
hexylena commented 7 years ago

Builds are now working on https://cpt.tamu.edu/jenkins/job/Chado-Prebuilt-Schemas

This is the latest error if anyone has comments on how to grok it.

ALREADY CALCULATED; not a Macro ID:474;; in cvterm/dbxref_id
USETS:  unique[ name cv_id is_obsolete ] unique[ dbxref_id ]
COLS: cvterm_id cv_id name definition dbxref_id is_obsolete is_relationshiptype
TRYING USET: ;cvterm_id; [pk=cvterm_id]
TRYING USET: ;name cv_id is_obsolete; [pk=cvterm_id]
USING DEFAULT[type=int4] is_obsolete => "0"
GOT unique_constr, select_col=cvterm_id
SQL: SELECT DISTINCT cvterm_id FROM cvterm WHERE name = 'adjacent_to' AND is_obsolete = '0' AND cv_id = '16'
SQL:INSERT INTO cvterm (is_relationshiptype, name, dbxref_id, cv_id) VALUES (?, ?, ?, ?)
ERROR:  duplicate key value violates unique constraint "cvterm_c2"
DETAIL:  Key (dbxref_id)=(474) already exists.
STATEMENT:  INSERT INTO cvterm (is_relationshiptype, name, dbxref_id, cv_id) VALUES ($1, $2, $3, $4)
DBD::Pg::st execute failed: ERROR:  duplicate key value violates unique constraint "cvterm_c2"
DETAIL:  Key (dbxref_id)=(474) already exists. [for Statement "INSERT INTO cvterm (is_relationshiptype, name, dbxref_id, cv_id) VALUES (?, ?, ?, ?)" with ParamValues: 1='1', 2='adjacent_to', 3='474', 4='16'] at /usr/local/share/perl/5.20.2/DBIx/DBStag.pm line 3322.
DBD::Pg::st execute failed: ERROR:  duplicate key value violates unique constraint "cvterm_c2"
DETAIL:  Key (dbxref_id)=(474) already exists. [for Statement "INSERT INTO cvterm (is_relationshiptype, name, dbxref_id, cv_id) VALUES (?, ?, ?, ?)" with ParamValues: 1='1', 2='adjacent_to', 3='474', 4='16'] at /usr/local/share/perl/5.20.2/DBIx/DBStag.pm line 3322.
 at /usr/local/share/perl/5.20.2/DBIx/DBStag.pm line 3332.
    DBIx::DBStag::insertrow(DBIx::DBStag=HASH(0x22fd088), "cvterm", HASH(0x2e435c0), "cvterm_id") called at /usr/local/share/perl/5.20.2/DBIx/DBStag.pm line 1928
    DBIx::DBStag::_storenode(DBIx::DBStag=HASH(0x22fd088), Data::Stag::StagImpl=ARRAY(0x3548de8)) called at /usr/local/share/perl/5.20.2/DBIx/DBStag.pm line 1180
    DBIx::DBStag::storenode(DBIx::DBStag=HASH(0x22fd088), Data::Stag::StagImpl=ARRAY(0x3546fd8)) called at /usr/local/bin/stag-storenode.pl line 88
    eval {...} called at /usr/local/bin/stag-storenode.pl line 87
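One possible reading of the trace above, sketched with an in-memory SQLite table rather than a real Chado database: the lookup by (name, cv_id, is_obsolete) finds nothing, so an INSERT is attempted, and that INSERT collides on the separate unique constraint on dbxref_id. The two unique sets are the ones listed in the USETS line; the pre-existing row under cv_id 11 is an assumption made up to trigger the collision.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced cvterm table with the two unique sets shown in the trace.
conn.execute(
    """
    CREATE TABLE cvterm (
        cvterm_id INTEGER PRIMARY KEY,
        cv_id INTEGER NOT NULL,
        name TEXT NOT NULL,
        dbxref_id INTEGER NOT NULL,
        is_obsolete INTEGER NOT NULL DEFAULT 0,
        is_relationshiptype INTEGER NOT NULL DEFAULT 0,
        CONSTRAINT cvterm_c1 UNIQUE (name, cv_id, is_obsolete),
        CONSTRAINT cvterm_c2 UNIQUE (dbxref_id)
    )
    """
)

# Assumption: dbxref 474 is already attached to a term in another cv (cv_id 11).
conn.execute(
    "INSERT INTO cvterm (cv_id, name, dbxref_id, is_relationshiptype) "
    "VALUES (11, 'adjacent_to', 474, 1)"
)

# Step 1: the lookup from the trace, restricted to cv_id 16, finds no row.
row = conn.execute(
    "SELECT DISTINCT cvterm_id FROM cvterm "
    "WHERE name = 'adjacent_to' AND is_obsolete = 0 AND cv_id = 16"
).fetchone()
lookup_missed = row is None

# Step 2: the insert is attempted anyway and trips the dbxref_id constraint.
try:
    conn.execute(
        "INSERT INTO cvterm (is_relationshiptype, name, dbxref_id, cv_id) "
        "VALUES (1, 'adjacent_to', 474, 16)"
    )
    insert_error = None
except sqlite3.IntegrityError as err:
    insert_error = str(err)
```

If this reading is right, the question becomes which earlier load attached that dbxref to a term under a different cv_id.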
scottcain commented 7 years ago

Is it possible that another cv added a term called "is_a" and the code you've written is inadvertently using it rather than the one that should be* coming from relationship? That has happened to me in the past.

*I'm using "should be" here in the pre-relations ontology sense--when a canonical "is_a" came from the relationship ontology.
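To check for that, one could list every cv that owns a term with a given name. The sketch below runs the query against a toy SQLite database; `cvs_with_term` is a made-up helper, the rows are invented, and against a real Chado instance the same join would be issued through a PostgreSQL driver, whose placeholder syntax may differ.

```python
import sqlite3


def cvs_with_term(conn, term_name):
    """List (cv name, cvterm_id) pairs for every cvterm called term_name."""
    query = """
        SELECT cv.name, cvterm.cvterm_id
        FROM cvterm
        JOIN cv ON cv.cv_id = cvterm.cv_id
        WHERE cvterm.name = ?
        ORDER BY cv.name
    """
    return conn.execute(query, (term_name,)).fetchall()


# Toy stand-in for a Chado database; the rows below are made up.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE cv (cv_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE cvterm (
        cvterm_id INTEGER PRIMARY KEY,
        cv_id INTEGER NOT NULL REFERENCES cv (cv_id),
        name TEXT NOT NULL
    );
    INSERT INTO cv (cv_id, name) VALUES (1, 'relationship'), (2, 'some_other_cv');
    INSERT INTO cvterm (cvterm_id, cv_id, name)
        VALUES (10, 1, 'is_a'), (20, 2, 'is_a'), (30, 2, 'part_of');
    """
)

owners = cvs_with_term(conn, "is_a")
```

More than one row in the result means some other cv also defines the term, which is the situation described above.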

abretaud commented 7 years ago

Back! I don't know if it's expected, but Jenkins is not responding right now. I'll check later.

How did you fix the errors on Gene Ontology and then Plant Ontology? Just by rerunning it?

Regarding oborel/obo-relations#115, the comment from @cybersiddhu is interesting: apparently the Perl OBO parser adds an extra "is_a" while converting from OBO to XML. This means that oborel/obo-relations#115 would break the OBO file for people not using the Perl parser...

Could there be a way for stag-storenode.pl to simply ignore the duplicate key violations?
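At the SQL level, "ignore duplicates" would look roughly like the sketch below. It is not an existing stag-storenode.pl option; it only shows the behaviour using SQLite's `INSERT OR IGNORE` (PostgreSQL 9.5 and later offers `INSERT ... ON CONFLICT DO NOTHING` for the same purpose). The helper name `insert_unless_duplicate` and the reduced table are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced cvterm table that keeps only the unique constraint on dbxref_id.
conn.execute(
    """
    CREATE TABLE cvterm (
        cvterm_id INTEGER PRIMARY KEY,
        cv_id INTEGER NOT NULL,
        name TEXT NOT NULL,
        dbxref_id INTEGER NOT NULL UNIQUE
    )
    """
)


def insert_unless_duplicate(conn, cv_id, name, dbxref_id):
    """Insert a cvterm row; return False if a uniqueness conflict skipped it."""
    cursor = conn.execute(
        "INSERT OR IGNORE INTO cvterm (cv_id, name, dbxref_id) VALUES (?, ?, ?)",
        (cv_id, name, dbxref_id),
    )
    return cursor.rowcount == 1


first = insert_unless_duplicate(conn, 11, "is_a", 32)   # row is stored
second = insert_unless_duplicate(conn, 11, "is_a", 32)  # conflict, silently skipped
row_count = conn.execute("SELECT COUNT(*) FROM cvterm").fetchone()[0]
```

The trade-off is that a skipped row is silent: if the existing term sits under the wrong cv, the load would succeed while leaving the mismatch in place.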

hexylena commented 7 years ago

@abretaud ok, server is back too. Sorry about that.

Some of the errors were fixed by re-running and by removing the is_a/magic statements from Scott, but unfortunately that did not fix all of them.

abretaud commented 7 years ago

This is done!