Closed dhimmel closed 8 years ago
If you want to use Neo4j 2.3, change the Docker commands to:
docker pull neo4j:2.3.5
docker run \
--publish=7474:7474 \
--volume=$OLS_HOME/neo4j:/data/graph.db \
--env=NEO4J_AUTH=none \
--env=NEO4J_ALLOW_STORE_UPGRADE=true \
neo4j:2.3.5
Here's a version of the subterm query that reports the minimum depth (min_depth) and the number of paths (n_paths) to the specified node:
MATCH path = (n:GO)<-[:SUBCLASSOF*..]-()
WHERE n.obo_id = 'GO:0051223'
WITH nodes(path) AS nodes, n
UNWIND nodes AS node
WITH DISTINCT node AS node, n
RETURN
node.obo_id AS identifier,
node.label AS name,
length(shortestPath((n)<-[:SUBCLASSOF*..]-(node))) AS min_depth,
size((n)<-[:SUBCLASSOF*..]-(node)) AS n_paths
ORDER BY min_depth, name
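For anyone who wants to run queries like this from a script rather than the Neo4j Browser, here is a minimal sketch. It assumes the transactional HTTP endpoint that Neo4j 2.x/3.x exposes at /db/data/transaction/commit and a server started with NEO4J_AUTH=none, as in the Docker command above. The helper names cypher_payload and run_cypher and the NEO4J_URL variable are hypothetical, and the example query is a simplified one-line lookup, not the full query above.

```shell
# Hypothetical helper: wrap a single-line Cypher statement in the JSON body
# used by Neo4j's transactional HTTP endpoint. Assumes the statement contains
# no double quotes, backslashes, or newlines (no JSON escaping is done here).
cypher_payload() {
  printf '{"statements":[{"statement":"%s"}]}' "$1"
}

# Hypothetical helper: POST the statement to a local server (auth disabled).
# NEO4J_URL is an assumed override; it defaults to the address used above.
run_cypher() {
  curl --silent \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --data "$(cypher_payload "$1")" \
    "${NEO4J_URL:-http://localhost:7474}/db/data/transaction/commit"
}

# Simplified example: look up the label of one GO term by its obo_id.
QUERY="MATCH (n:GO) WHERE n.obo_id = 'GO:0051223' RETURN n.label AS name"

# With the server running, send it with:
#   run_cypher "$QUERY"
```

The response comes back as JSON, so it can be piped into whatever JSON tooling you prefer.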
Very cool!
I created a repository to import several ontologies into a compressed Neo4j database. The goal is to allow individuals an easy path to deploy an ultimate Neo4j ontology instance. I hope to add configurations for more ontologies going forward.
I submitted an initial pull request at https://github.com/greenelab/hetontology/pull/1. We'd be delighted if anyone from this repository would be willing to review the pull request and provide feedback or general advice.
Also let us know if there are any specific ways you'd like to be referenced for your great contribution.
Hi, thank you for your post!
No, we don't allow direct access to the neo4j database. However, we hope to expose all necessary data through the API. I put in some work to create a python client for OLS, which is not finished or perfect yet - but should allow people to use OLS with python. However, this client is using the OLS API, so I am not sure if that is what you are looking for.
For people that want more functionality, we hope that the open source code and the documentation are enough to get you started to set up a local instance. Obviously, you managed to do that.
Just from reading your last post (not the code), I am not sure I understand what the goal of your new project is. Isn't OLS (or parts of it) a neo4j ontology instance? So you want to make it easier for people to set up neo4j, using the OLS ontology 'import'?
I put in some work to create a python client for OLS, which is not finished or perfect yet - but should allow people to use OLS with python
@LLTommy, I think this will be valuable and will be great for people who know python but not cypher. Let me know when it's available! What I like about the public Neo4j instance is the versatility of Cypher (you can perform most hetnet queries efficiently), the diverse language support, and the Neo4j Browser for immediate access.
I am not sure I understand what the goal of your new project is.
I want to provide the following utility, by building on top of the ols-neo4j-app:
hetontology.db.tar.xz
I think these utilities will serve a growing user base as Cypher and hetnets become major technologies in bioinformatics. I anticipate the project will not take too long, but would really benefit from involvement of any interested OLS contributors -- even if just for code review, feedback, and answering ontological questions. Recognizing the contributions of everyone involved will be a priority of the project.
This sounds cool. I can also make the full OLS neo4j db available on our ftp. This is updated nightly and includes the full set of OBO library ontologies. I also like the Neo4j browser guide idea! Keep us posted on any developments.
I can also make the full OLS neo4j db available on our ftp. This is updated nightly and includes the full set of OBO library ontologies.
That would be awesome and prevent any duplication of effort. I didn't realize you were including ontologies beyond the ones with configuration files.
Keep us posted on any developments.
Will do!
Yes, the config files provided are just some examples for anyone wanting to experiment.
The full OLS system is also able to read in the OBO library YAML config file (http://www.obofoundry.org/registry/ontologies.yml), so we use this to pull in a whole bunch of ontologies on the live site http://www.ebi.ac.uk/ols/ontologies
@simonjupp nice setup. Being able to access the compressed database on the FTP site would be a HUGE convenience!
There's a copy of the neo4j db here for you to try, can you let me know how you get on?
ftp://ftp.ebi.ac.uk/pub/databases/spot/ols/neo4j/
@simonjupp 🎆 !
I'm downloading the database using a shell script:
URL=ftp://ftp.ebi.ac.uk/pub/databases/spot/ols/neo4j/ols-neo4j-29-07-16.tar.gz
mkdir ols-neo4j.db
curl $URL | tar --extract --gzip --strip-components=1 --directory=ols-neo4j.db
I'm on slow wifi, so I haven't completed the download, but I was surprised by the large file size (7.4 GB). When I created an xz-compressed database containing 5 ontologies, the file was only 31 MB. I think your large file size is due to log files that aren't essential for copying the database. For example, I see a lot of neostore.transaction.db files, which can be deleted if the server isn't running.
When you create the gzip file, would it be possible to ignore files that start with neostore.transaction.db or messages.log? If you're using the tar command line utility, there should be an easy way to ignore files that match given patterns. I think that would make the file size considerably smaller.
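As a rough illustration of the pattern-based exclusion suggested above, the sketch below uses tar's --exclude option on a throwaway directory. The directory layout and file names are made up for the demo and only mimic a Neo4j store; the real archive is presumably built differently.

```shell
# Build a throwaway directory that mimics a Neo4j store (illustrative names).
workdir=$(mktemp -d)
mkdir -p "$workdir/graph.db"
touch "$workdir/graph.db/neostore" \
      "$workdir/graph.db/neostore.transaction.db.0" \
      "$workdir/graph.db/neostore.transaction.db.1" \
      "$workdir/graph.db/messages.log"

# Archive the store, skipping transaction logs and messages.log.
# The --exclude options must appear before the paths they apply to.
tar --create --gzip \
  --exclude='neostore.transaction.db*' \
  --exclude='messages.log' \
  --file="$workdir/graph.db.tar.gz" \
  --directory="$workdir" graph.db

# List what actually went into the archive.
archive_listing=$(tar --list --gzip --file="$workdir/graph.db.tar.gz")
echo "$archive_listing"
```

Only the directory entry and the neostore file should end up in the listing; the transaction logs and messages.log are left out.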
Good point, we'll have to look into this - and we will.
Rebuilt tarball without the transaction files, it's still 5.7GB. This is over 150 ontologies and some of them are quite hefty.
I'm not sure how much you are constrained by EBI convention for FTP files. My recommendations are to:
Thanks everyone for the help. I'm going to close this issue to keep the issues list clean.
However, we'll make sure to update this issue with hetontology progress.
Introduction
I've always found it's been a real pain for computational biologists to interact with ontologies in Python. The ontology community seems focused on java products and the file formats (obo/owl) are super complex. I think in terms of networks and am most comfortable reasoning over ontologies using networkx or neo4j. So when I stumbled upon this presentation, I was super excited about easily loading ontologies into Neo4j.
Results
I was amazed how easy it was to get the Gene Ontology up and running in Neo4j using this codebase and docker:
Subsequently, the Neo4j server was up and running at http://localhost:7474. I was able to run basic Cypher queries such as finding all subterms of regulation of protein transport (GO:0051223):
Just wanted to thank the OLS team for this amazing development!
Do you allow public access to the neo4j instances hosted by EBI? For example, we host a public instance of our network for drug repurposing called Hetionet at https://neo4j.het.io.