saramsey closed this 2 years ago
Hi @acevedol: we might want to hold off a bit to see if we can also get the fix for #141 into the 2.7.3 build. Let's discuss.
Hi @acevedol, just an FYI: I have committed what I hope is a fix for #141 in the issue-141 branch. I am doing some more thorough testing now. If everything looks good, I will push that commit upstream to the master branch so we can hopefully include that fix in the KG2.7.3 build.
OK, I have pushed the fix for #141 to the master branch. I think it is ready for inclusion in the KG2.7.3 build.
Deleted kg2-code on buildkg2.rtx.ai and cloned it fresh from the repo.
From ~/kg2-build, ran:
source ~/kg2-venv/bin/activate
python3 ~/kg2-code/validate_provided_by_to_infores_map_yaml.py ~/kg2-code/kg2-provided-by-curie-to-infores-curie.yaml ./infores-catalog.tsv
deactivate
It ran with no output, which, according to https://github.com/RTXteam/RTX-KG2/issues/104#issuecomment-893052649, should mean there were no errors.
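For anyone who wants to script this check rather than eyeball it, here is a minimal sketch. It assumes (this is my reading of the linked comment, not something the validator documents) that "no errors" means the script exits with status 0 and prints nothing. The helper name `ran_clean` is hypothetical.

```python
import subprocess
import sys


def ran_clean(cmd):
    """Return True if cmd exits with status 0 and writes nothing to stdout/stderr.

    Assumption: a silent, zero-status run is what "no errors" looks like.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    silent = not result.stdout.strip() and not result.stderr.strip()
    return result.returncode == 0 and silent


# Stand-in commands so the sketch runs anywhere; on the build instance the
# list would instead hold the python3 validate_... command shown above.
quiet_ok = ran_clean([sys.executable, "-c", "pass"])
noisy = ran_clean([sys.executable, "-c", "print('problem found')"])
failed = ran_clean([sys.executable, "-c", "import sys; sys.exit(1)"])
print(quiet_ok, noisy, failed)
```

Checking the exit status as well as the output guards against the case where the script dies before it gets a chance to print anything.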
Ran a dry run with bash -x ~/kg2-code/build-kg2-snakemake.sh all -F -n
The log file shows all 48 jobs, as expected.
Running full build with bash -x ~/kg2-code/build-kg2-snakemake.sh all -F
Build crashed at DrugCentral rule
Error in extract-drugcentral.log is Error: role "jjyang" does not exist
I'm not sure yet whether this is an actual problem, but kg2-build/build-kg2-snakemake.log shows an error in rule UniChem, although the script is still running.
After changes to extract-drugcentral and a little command-line correcting, extract-drugcentral finished correctly. The jjyang error came from the role being created out of order.
Another error
Presumably due to the UniChem error since the drug central script was able to complete successfully
Unichem error using curl -v
UniChem seems to be an access control problem. The script uses an anonymous user and is denied access. I'm digging in https://www.ebi.ac.uk for possible solutions.
UDRI version 385 appears to be current. Version 375 is no longer available.
Error in build-multi-ont-kg.log
I am still stuck on the above error. The line
/usr/bin/java -Xms2G -Xmx255683G -DentityExpansionLimit=4086000 -Djava.awt.headless=true -classpath /home/ubuntu/kg2-build/owltools owltools.cli.CommandLineInterface biolink-model.owl.ttl -o -f json /tmp/kg2-97rgzijn.json
seems to be where it's stuck, but I can't find where to change the settings for the VM stack size.
OK, this portion of the shell command strongly indicates a bug:
/usr/bin/java -Xms2G -Xmx255683G
since 255683G is 255 Terabytes (!). I am checking on the cause of this bug now...
I suspect my code changes to get-system-memory.sh for #137 caused this bug. Testing that hunch now...
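A quick arithmetic check on that -Xmx value, as a rough sketch. The JVM treats the `G` suffix as binary gigabytes. The second calculation is only a guess on my part: it asks what the same number would mean if it had really been a megabyte count that got a `G` suffix. The thread does not say what get-system-memory.sh actually returned, so treat that reading as an assumption.

```python
GIB_PER_TIB = 1024
MIB_PER_GIB = 1024

# The number that appeared before the "G" suffix in -Xmx255683G.
xmx_value = 255683

# Heap size actually requested, reading the value as GiB.
requested_tib = xmx_value / GIB_PER_TIB

# Hypothetical reading: the same number interpreted as MiB instead.
as_gib_if_mib = xmx_value / MIB_PER_GIB

print(f"-Xmx{xmx_value}G requests about {requested_tib:.1f} TiB of heap")
print(f"{xmx_value} MiB would be about {as_gib_if_mib:.1f} GiB")
```

Either way the request is hundreds of terabytes, far beyond any single build machine. The MiB reading comes out just under 250 GiB, which is at least consistent with the -Xmx249G seen in the later ontology log.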
Commit 40d9a6a should fix the problem with build-multi-ont-kg.log
Another error in Ontologies rule
The fix above for get-system-memory.sh did correct the problem. Thank you, Steve!
Ontology exited with error
Reading ontology file: foodon.owl; size: 6944.69 KiB
/usr/bin/java -Xms2G -Xmx249G -DentityExpansionLimit=4086000 -Djava.awt.headless=true -classpath /home/ubuntu/kg2-build/owltools owltools.cli.CommandLineInterface foodon.owl -o -f json /tmp/kg2-_afrl46i.json
2021-09-15 02:38:43,998 ERROR (CommandRunner:4815) could not parse:foodon.owl
org.semanticweb.owlapi.model.UnloadableImportException: Could not load imported ontology: <http://purl.obolibrary.org/obo/foodon/imports/dietary_supplement_import.owl> Cause: https://raw.githubusercontent.com/FoodOntology/foodon/master/imports/dietary_supplement_import.owl
    at uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.makeLoadImportRequest(OWLOntologyManagerImpl.java:1870)
    at org.semanticweb.owlapi.rdf.rdfxml.parser.TripleHandlers$TPImportsHandler.handleTriple(TripleHandlers.java:1537)
    at org.semanticweb.owlapi.rdf.rdfxml.parser.TripleHandlers$HandlerAccessor.handleStreaming(TripleHandlers.java:194)
    at org.semanticweb.owlapi.rdf.rdfxml.parser.OWLRDFConsumer.statementWithResourceValue(OWLRDFConsumer.java:1545)
    at org.semanticweb.owlapi.rdf.rdfxml.parser.RDFParser.statementWithResourceValue(RDFParser.java:370)
    at org.semanticweb.owlapi.rdf.rdfxml.parser.EmptyPropertyElement.startElement(StartRDF.java:236)
    at org.semanticweb.owlapi.rdf.rdfxml.parser.PropertyElementList.startElement(StartRDF.java:658)
    at org.semanticweb.owlapi.rdf.rdfxml.parser.RDFParser.startElement(RDFParser.java:201)
    at org.apache.xerces.parsers.AbstractSAXParser.startElement(Unknown Source)
    at org.apache.xerces.parsers.AbstractXMLDocumentParser.emptyElement(Unknown Source)
    at org.apache.xerces.impl.dtd.XMLDTDValidator.emptyElement(Unknown Source)
    at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl.parse(Unknown Source)
    at org.semanticweb.owlapi.rdf.rdfxml.parser.RDFParser.parse(RDFParser.java:145)
    at org.semanticweb.owlapi.rdf.rdfxml.parser.RDFXMLParser.parse(RDFXMLParser.java:73)
    at uk.ac.manchester.cs.owl.owlapi.OWLOntologyFactoryImpl.loadOWLOntology(OWLOntologyFactoryImpl.java:220)
    at uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.actualParse(OWLOntologyManagerImpl.java:1254)
    at uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntology(OWLOntologyManagerImpl.java:1208)
    at uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntology(OWLOntologyManagerImpl.java:1108)
    at uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntology(OWLOntologyManagerImpl.java:1064)
    at owltools.io.ParserWrapper.parseOWL(ParserWrapper.java:163)
    at owltools.io.ParserWrapper.parseOWL(ParserWrapper.java:150)
    at owltools.io.ParserWrapper.parse(ParserWrapper.java:132)
    at owltools.cli.CommandRunner.runSingleIteration(CommandRunner.java:4803)
    at owltools.cli.CommandRunnerBase.run(CommandRunnerBase.java:76)
    at owltools.cli.CommandRunnerBase.run(CommandRunnerBase.java:68)
    at owltools.cli.CommandLineInterface.main(CommandLineInterface.java:12)
Caused by: org.semanticweb.owlapi.io.OWLOntologyCreationIOException: https://raw.githubusercontent.com/FoodOntology/foodon/master/imports/dietary_supplement_import.owl
    at uk.ac.manchester.cs.owl.owlapi.OWLOntologyFactoryImpl.loadOWLOntology(OWLOntologyFactoryImpl.java:230)
    at uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.actualParse(OWLOntologyManagerImpl.java:1254)
    at uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntology(OWLOntologyManagerImpl.java:1208)
    at uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntology(OWLOntologyManagerImpl.java:1108)
    at uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadImports(OWLOntologyManagerImpl.java:1825)
    at uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.makeLoadImportRequest(OWLOntologyManagerImpl.java:1863)
    ... 33 more
Caused by: java.io.FileNotFoundException: https://raw.githubusercontent.com/FoodOntology/foodon/master/imports/dietary_supplement_import.owl
    at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
    at java.base/sun.net.www.protocol.http.HttpURLConnection$10.run(HttpURLConnection.java:1974)
    at java.base/sun.net.www.protocol.http.HttpURLConnection$10.run(HttpURLConnection.java:1969)
    at java.base/java.security.AccessController.doPrivileged(Native Method)
    at java.base/sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1968)
    at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1536)
    at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1520)
    at java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:250)
    at org.semanticweb.owlapi.io.AbstractOWLParser.getInputStreamFromContentEncoding(AbstractOWLParser.java:179)
    at org.semanticweb.owlapi.io.AbstractOWLParser.getInputStream(AbstractOWLParser.java:141)
    at org.semanticweb.owlapi.io.AbstractOWLParser.getInputSource(AbstractOWLParser.java:264)
    at org.semanticweb.owlapi.rdf.rdfxml.parser.RDFXMLParser.parse(RDFXMLParser.java:72)
    at uk.ac.manchester.cs.owl.owlapi.OWLOntologyFactoryImpl.loadOWLOntology(OWLOntologyFactoryImpl.java:220)
    ... 38 more
Caused by: java.io.FileNotFoundException: https://raw.githubusercontent.com/FoodOntology/foodon/master/imports/dietary_supplement_import.owl
    at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1920)
    at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1520)
    at java.base/sun.net.www.protocol.http.HttpURLConnection.getHeaderField(HttpURLConnection.java:3099)
    at java.base/java.net.URLConnection.getContentEncoding(URLConnection.java:530)
    at java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.getContentEncoding(HttpsURLConnectionImpl.java:406)
    at org.semanticweb.owlapi.io.AbstractOWLParser.getInputStream(AbstractOWLParser.java:136)
    ... 41 more
Traceback (most recent call last):
  File "/home/ubuntu/kg2-code/multi_ont_to_json_kg.py", line 1362, in <module>
    save_pickle)
  File "/home/ubuntu/kg2-code/multi_ont_to_json_kg.py", line 143, in make_kg2
    save_pickle)
  File "/home/ubuntu/kg2-code/multi_ont_to_json_kg.py", line 67, in load_ont_file_return_ontology_and_metadata
    ontology = kg2_util.make_ontology_from_local_file(file_name, save_pickle=save_pickle)
  File "/home/ubuntu/RTX-KG2/kg2_util.py", line 783, in make_ontology_from_local_file
    check=True)
  File "/usr/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['owltools', 'foodon.owl', '-o', '-f', 'json', '/tmp/kg2-_afrl46i.json']' returned non-zero exit status 1.
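The useful part of that trace is the first message: the import IRI that failed and the URL it resolved to. As a rough sketch, the snippet below pulls those two values out of log text with a regular expression. The pattern is written against the one message shown above, and the function name `find_failed_imports` is hypothetical; other owltools or OWL API versions may word the message differently.

```python
import re

# Matches the "Could not load imported ontology" message as it appears in
# the log above: the import IRI in angle brackets, then the "Cause:" URL.
IMPORT_ERROR_RE = re.compile(
    r"Could not load imported ontology: <([^>]+)>\s+Cause: (\S+)"
)


def find_failed_imports(log_text):
    """Return a list of (import IRI, cause URL) pairs found in log_text."""
    return IMPORT_ERROR_RE.findall(log_text)


# Sample text copied from the foodon.owl error above.
sample_log = (
    "org.semanticweb.owlapi.model.UnloadableImportException: "
    "Could not load imported ontology: "
    "<http://purl.obolibrary.org/obo/foodon/imports/dietary_supplement_import.owl> "
    "Cause: https://raw.githubusercontent.com/FoodOntology/foodon/master/"
    "imports/dietary_supplement_import.owl "
    "at uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl."
    "makeLoadImportRequest(OWLOntologyManagerImpl.java:1870)"
)

for iri, cause in find_failed_imports(sample_log):
    print(f"import {iri}\n  failed on {cause}")
```

Here the innermost cause is a FileNotFoundException on the raw.githubusercontent.com URL, so the file the ontology imports could not be fetched from that location at build time.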
Error in Rule Simplify
relation curie is missing from the YAML config file: RO:0002470
There are relation curies missing from the yaml config file. Please add them and try again. Exiting.
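The check behind that message amounts to a set difference: every relation CURIE seen in the graph has to have an entry in the YAML config. A minimal sketch of the idea follows. The function name, the shape of the config (relation CURIEs as top-level keys), and the sample entries other than RO:0002470 are all assumptions for illustration, not the actual KG2 code or config.

```python
def find_missing_relation_curies(relation_curies_in_graph, remap_config):
    """Return the sorted relation CURIEs that have no entry in remap_config.

    Assumption: remap_config is the parsed YAML config, represented here as
    a dict whose keys are relation CURIEs.
    """
    return sorted(set(relation_curies_in_graph) - set(remap_config))


# Hypothetical stand-in for the parsed YAML config; values are placeholders.
remap_config = {
    "RO:0000052": "placeholder",
    "RO:0000056": "placeholder",
}

# Relation CURIEs encountered in the graph (illustrative).
seen_in_graph = ["RO:0000052", "RO:0002470", "RO:0000056"]

missing = find_missing_relation_curies(seen_in_graph, remap_config)
for curie in missing:
    print(f"relation curie is missing from the YAML config file: {curie}")
```

Collecting every missing CURIE before exiting means one build attempt surfaces all of the gaps at once, instead of one per run.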
KG2 build completed. The complete log is at /home/ubuntu/.snakemake/log/2021-09-16T181654.939719.snakemake.log
On the build instance, kg2-version.txt shows 2.7.3, and I tagged the repo with KG2.7.3.
kg2-simplified-report.json grew from 48 KB to 50 KB: 4308 more edges, 1525 more nodes.
I accidentally deleted the checklist at the top of this issue.
Installed new TSV files on kg2endpoint4.rtx.ai.
The log file kg2-build/setup-kg2-neo4j.log ends with ======= script finished ======
Updated CNAME for kg2endpoint4.rtx.ai to point to kg2endpoint-kg2-7-3.rtx.ai
Made directory on rtxconfig@arax.ncats.io for KG2.7.3
Updated kg2c_config.json
Error running the synonymizer. It appears similar to an error from the last build.
OK, the fix is pushed to the kg2integration branch (in the RTX repo). It was related to Finn's changes to configv2.json yesterday; I made some tweaks in https://github.com/RTXteam/RTX/commit/2509ca4314d67a1e5185ba2bcc18b02c0cc27fad so the KG2c build shouldn't be so fragile in terms of config files anymore.
Also worth noting: this KG2c build should be done from the kg2integration branch. We should probably add a step to the build checklist to touch base about which branch the KG2c build should be done from. :)
Added "Check with Amy which branch to use for building kg2c" to checklist
NodeSynonymizer build finished with 2021-09-18 20:43:23,810 INFO: Done building synonymizer.
Checked arax.ncats.io to make sure synonymizer files are present.
Updated kg2c_config.json to build kg2c
BuildKG2C completed and the files on arax.ncats.io are slightly larger than in KG2.7.2C
Loading kg2c into Neo4J on kg2canonicalized.rtx.ai with bash -x RTX/code/kg2c/tsv-to-neo4j-canonicalized.sh
Neo4J loading completed with
Mon Sep 20 17:46:33 UTC 2021
+ echo '================ script finished ============================'
================ script finished ============================
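Both the setup-kg2-neo4j.log check and this one come down to confirming that the last line of a log is the "script finished" banner. A minimal sketch of that check, assuming the banner text always contains the phrase "script finished" (the two banners in this thread differ only in the number of `=` signs); the function name `log_finished` is hypothetical.

```python
def log_finished(log_text, marker="script finished"):
    """Return True if the last non-blank line of log_text contains marker."""
    lines = [line for line in log_text.splitlines() if line.strip()]
    return bool(lines) and marker in lines[-1]


# Tail of the Neo4j loading log quoted above.
sample_tail = (
    "Mon Sep 20 17:46:33 UTC 2021\n"
    "+ echo '================ script finished ============================'\n"
    "================ script finished ============================\n"
)

print(log_finished(sample_tail))
print(log_finished("Mon Sep 20 17:46:33 UTC 2021\nsome error\n"))
```

Looking only at the last non-blank line avoids a false positive from the `+ echo` trace line that `bash -x` prints just before the real banner.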
Updated CNAME for kg2canonicalized.rtx.ai
Validated results at http://kg2-7-3c.rtx.ai:7474/browser/ and results make sense
This was finished a while back
Aiming to do a KG2.7.3 build the week of Aug. 30 - Sep. 3, to get the fix for #131 out as soon as possible.