Closed callahantiff closed 3 years ago
Hello @callahantiff, sorry for the delay! I believe something like that may work, but if it takes more than an hour (and in my experience, anything in Google Cloud that is not completely straightforward tends to spontaneously combust at some point), we could instead hard-code some reference URL(s) for the latest version of PheKnowLator into a metadata JSON within Ensmallen that can be updated as necessary.
Would this be a sensible, low-effort solution? Admittedly, it does not allow automatic retrieval of newly deployed graphs unless they replace the old ones at the previous URLs.
One such example is how I am currently handling the kg-hub graphs since they still have some oddities.
Thanks so much @LucaCappelletti94. This sounds totally reasonable. I will take a closer look this weekend and follow up once I have a better sense. Thank you!
Hello @callahantiff, I am doing a round of updates to the graph retrieval. Any news on the PheKnowLator graph availability? I saw it is offered in OWL, but we do not currently support the OWL format.
Hey @LucaCappelletti94 -
Thanks for circling back with me. We would love to be included and we do provide data in a format other than OWL. Information on all of the output files produced, including a snippet of the output, can be found here: https://github.com/callahantiff/PheKnowLator/wiki/KG-Construction#table-knowledge-graph-build-output. Per our conversation on Slack, I think the files for each build that will work the best and be the easiest to use with existing infrastructure will be the XXXX_Triples_Identifiers.txt files.
There are two other updates that I wanted to give you below.
There is a JSON file called pheknowlator_builds.json that can serve as a proxy for a directory listing. It gets updated each month and can be accessed from the following URL: https://storage.googleapis.com/pheknowlator/pheknowlator_builds.json.
Currently, there is an entry for metadata, and then one for each monthly build/release (ordered temporally), where each KG build is referenced by key within each monthly build (hope that makes sense -- additional information on how the 12 builds differ is shown below). Note that if a particular file was not available for a monthly build/release, it will be noted with the value null. A snippet of the output is shown below. Let me know if you see any problems with this structure; it's very easy to make changes!
{
"metadata": "For more information on the PheKnowLator Builds, please visit the project GitHub: https://github.com/callahantiff/PheKnowLator",
"v2.0.0-2020-5-10": {
"instance-inverseRelations-owl": "https://storage.googleapis.com/pheknowlator/archived_builds/release_v2.0.0/build_10MAY2020/knowledge_graphs/instance_builds/inverse_relations/owlnets/PheKnowLator_v2.0.0_full_Instance_inverseRelations_noOWL_Triples_Identifiers.txt",
"instance-inverseRelations-owlnets": "https://storage.googleapis.com/pheknowlator/archived_builds/release_v2.0.0/build_10MAY2020/knowledge_graphs/instance_builds/inverse_relations/owlnets/PheKnowLator_v2.0.0_full_Instance_inverseRelations_noOWL_Triples_Identifiers.txt",
"instance-inverseRelations-owlnets-purified": null,
"instance-relationsOnly-owl": "https://storage.googleapis.com/pheknowlator/archived_builds/release_v2.0.0/build_10MAY2020/knowledge_graphs/instance_builds/relations_only/owlnets/PheKnowLator_v2.0.0_full_Instance_relationsOnly_noOWL_Triples_Identifiers.txt",
"instance-relationsOnly-owlnets": "https://storage.googleapis.com/pheknowlator/archived_builds/release_v2.0.0/build_10MAY2020/knowledge_graphs/instance_builds/relations_only/owlnets/PheKnowLator_v2.0.0_full_Instance_relationsOnly_noOWL_Triples_Identifiers.txt",
"instance-relationsOnly-owlnets-purified": null,
"subclass-inverseRelations-owl": "https://storage.googleapis.com/pheknowlator/archived_builds/release_v2.0.0/build_10MAY2020/knowledge_graphs/subclass_builds/inverse_relations/owlnets/PheKnowLator_v2.0.0_full_subclass_inverseRelations_noOWL_Triples_Identifiers.txt",
"subclass-inverseRelations-owlnets": "https://storage.googleapis.com/pheknowlator/archived_builds/release_v2.0.0/build_10MAY2020/knowledge_graphs/subclass_builds/inverse_relations/owlnets/PheKnowLator_v2.0.0_full_subclass_inverseRelations_noOWL_Triples_Identifiers.txt",
"subclass-inverseRelations-owlnets-purified": null,
"subclass-relationsOnly-owl": "https://storage.googleapis.com/pheknowlator/archived_builds/release_v2.0.0/build_10MAY2020/knowledge_graphs/subclass_builds/relations_only/owlnets/PheKnowLator_v2.0.0_full_subclass_relationsOnly_noOWL_Triples_Identifiers.txt",
"subclass-relationsOnly-owlnets": "https://storage.googleapis.com/pheknowlator/archived_builds/release_v2.0.0/build_10MAY2020/knowledge_graphs/subclass_builds/relations_only/owlnets/PheKnowLator_v2.0.0_full_subclass_relationsOnly_noOWL_Triples_Identifiers.txt",
"subclass-relationsOnly-owlnets-purified": null
},
"v2.0.0-2021-1-25": {
"instance-inverseRelations-owl": "https://storage.googleapis.com/pheknowlator/archived_builds/release_v2.0.0/build_25JAN2021/knowledge_graphs/instance_builds/inverse_relations/owlnets/PheKnowLator_v2.0.0_full_instance_inverseRelations_noOWL_Triples_Identifiers.txt",
"instance-inverseRelations-owlnets": "https://storage.googleapis.com/pheknowlator/archived_builds/release_v2.0.0/build_25JAN2021/knowledge_graphs/instance_builds/inverse_relations/owlnets/PheKnowLator_v2.0.0_full_instance_inverseRelations_noOWL_Triples_Identifiers.txt",
"instance-inverseRelations-owlnets-purified": "https://storage.googleapis.com/pheknowlator/archived_builds/release_v2.0.0/build_25JAN2021/knowledge_graphs/instance_builds/inverse_relations/owlnets/PheKnowLator_v2.0.0_full_instance_inverseRelations_noOWL_INSTANCE_purified_Triples_Identifiers.txt",
"instance-relationsOnly-owl": "https://storage.googleapis.com/pheknowlator/archived_builds/release_v2.0.0/build_25JAN2021/knowledge_graphs/instance_builds/relations_only/owlnets/PheKnowLator_v2.0.0_full_instance_relationsOnly_noOWL_Triples_Identifiers.txt",
"instance-relationsOnly-owlnets": "https://storage.googleapis.com/pheknowlator/archived_builds/release_v2.0.0/build_25JAN2021/knowledge_graphs/instance_builds/relations_only/owlnets/PheKnowLator_v2.0.0_full_instance_relationsOnly_noOWL_Triples_Identifiers.txt",
"instance-relationsOnly-owlnets-purified": "https://storage.googleapis.com/pheknowlator/archived_builds/release_v2.0.0/build_25JAN2021/knowledge_graphs/instance_builds/relations_only/owlnets/PheKnowLator_v2.0.0_full_instance_relationsOnly_noOWL_INSTANCE_purified_Triples_Identifiers.txt",
"subclass-inverseRelations-owl": "https://storage.googleapis.com/pheknowlator/archived_builds/release_v2.0.0/build_25JAN2021/knowledge_graphs/subclass_builds/inverse_relations/owlnets/PheKnowLator_v2.0.0_full_subclass_inverseRelations_noOWL_Triples_Identifiers.txt",
"subclass-inverseRelations-owlnets": "https://storage.googleapis.com/pheknowlator/archived_builds/release_v2.0.0/build_25JAN2021/knowledge_graphs/subclass_builds/inverse_relations/owlnets/PheKnowLator_v2.0.0_full_subclass_inverseRelations_noOWL_Triples_Identifiers.txt",
"subclass-inverseRelations-owlnets-purified": "https://storage.googleapis.com/pheknowlator/archived_builds/release_v2.0.0/build_25JAN2021/knowledge_graphs/subclass_builds/inverse_relations/owlnets/PheKnowLator_v2.0.0_full_subclass_inverseRelations_noOWL_SUBCLASS_purified_Triples_Identifiers.txt",
"subclass-relationsOnly-owl": "https://storage.googleapis.com/pheknowlator/archived_builds/release_v2.0.0/build_25JAN2021/knowledge_graphs/subclass_builds/relations_only/owlnets/PheKnowLator_v2.0.0_full_subclass_relationsOnly_noOWL_Triples_Identifiers.txt",
"subclass-relationsOnly-owlnets": "https://storage.googleapis.com/pheknowlator/archived_builds/release_v2.0.0/build_25JAN2021/knowledge_graphs/subclass_builds/relations_only/owlnets/PheKnowLator_v2.0.0_full_subclass_relationsOnly_noOWL_Triples_Identifiers.txt",
"subclass-relationsOnly-owlnets-purified": "https://storage.googleapis.com/pheknowlator/archived_builds/release_v2.0.0/build_25JAN2021/knowledge_graphs/subclass_builds/relations_only/owlnets/PheKnowLator_v2.0.0_full_subclass_relationsOnly_noOWL_SUBCLASS_purified_Triples_Identifiers.txt"
},
... }
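For anyone consuming this file, here is a rough sketch of how a client might pick the most recent available URL for a given graph key, skipping builds where the value is null. It assumes the structure shown in the snippet above (a "metadata" entry plus release names ending in year-month-day). The function names `load_builds`, `build_date`, and `latest_url` are made up for illustration and are not part of PheKnowLator or Ensmallen.

```python
import json
from datetime import date
from typing import Optional, Tuple
from urllib.request import urlopen

# URL taken from the comment above.
BUILDS_URL = "https://storage.googleapis.com/pheknowlator/pheknowlator_builds.json"


def load_builds(url: str = BUILDS_URL) -> dict:
    """Download and parse the builds listing (hypothetical helper)."""
    with urlopen(url) as response:
        return json.load(response)


def build_date(build_name: str) -> date:
    """Parse a release name such as 'v2.0.0-2021-1-25' into a date.

    Assumes the name always ends in '-<year>-<month>-<day>'.
    """
    _, year, month, day = build_name.rsplit("-", 3)
    return date(int(year), int(month), int(day))


def latest_url(builds: dict, graph_key: str) -> Optional[Tuple[str, str]]:
    """Return (release name, URL) for the newest release that has a
    non-null URL for graph_key, or None if no release provides it."""
    releases = sorted(
        (name for name in builds if name != "metadata"),
        key=build_date,
        reverse=True,
    )
    for name in releases:
        url = builds[name].get(graph_key)
        if url is not None:
            return name, url
    return None
```

Calling `latest_url(load_builds(), "instance-relationsOnly-owlnets-purified")` would then fall back to an older release whenever the newest one lists null for that key.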
[...] subclass-based (examples here) and instance-based (examples here). An example of the difference is shown in the image below.
[...] relations_only) or by inferring the inverse of the relation, which we accomplish using two strategies (i.e., inverse_relations) -- see example below. For more details and examples, see here.
[...] _OWLNETS_ in the file name and the original full OWL file will be noted with OWL in the filename (for additional information see here). [...] _SUBCLASS_purified_ or _INSTANCE_purified_ in the file names. For example, if the build is instance-based, then all rdfs:subClassOf relations are converted to rdf:type, and for all triples where an rdfs:subClassOf relation occurred we add rdf:type relations between the object of this triple and all of its ancestors. For a subclass-based build, we implement the same procedure but replace all occurrences of rdf:type with rdfs:subClassOf.
By varying the different combinations of the Construction Approach, Relation Strategy, and Property Graph Abstraction you end up with the following 12 KGs:
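As a quick illustration of where the number 12 comes from, the sketch below enumerates the combinations as 2 construction approaches x 2 relation strategies x 3 graph abstractions. The key spellings are copied from the keys in the pheknowlator_builds.json snippet earlier in this thread. The constant names and `build_keys` are hypothetical and only for illustration.

```python
from itertools import product

# Values taken from the keys in the JSON snippet above.
CONSTRUCTION_APPROACHES = ("instance", "subclass")
RELATION_STRATEGIES = ("inverseRelations", "relationsOnly")
GRAPH_ABSTRACTIONS = ("owl", "owlnets", "owlnets-purified")


def build_keys() -> list:
    """Return one key per combination, e.g. 'instance-relationsOnly-owlnets'."""
    return [
        "-".join(parts)
        for parts in product(
            CONSTRUCTION_APPROACHES, RELATION_STRATEGIES, GRAPH_ABSTRACTIONS
        )
    ]


if __name__ == "__main__":
    for key in build_keys():
        print(key)
```

The resulting keys come out in the same order as they appear within each monthly build in the JSON file.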
Awesome, @LucaCappelletti94, thanks for your help with this. I will close this issue, but please feel free to re-open if we need to do more work here.
Task
Add functionality to enable the listing of objects within a Google GCS bucket. There are some interesting solutions proposed on the web to account for the lack of native functionality.
Potential Solution
One such solution essentially creates a listing of the items in a bucket and exposes it as a web service. I'm not sure that this is the best way to go. @LucaCappelletti94, if I were to implement something similar to this, would that meet your requirements? Can you describe in a bit more detail what functionality you are looking for, and I will make sure to add it? 😄 🤔
Thank you!
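For reference, one possible way to list objects is the Cloud Storage JSON API's object listing endpoint, which returns pages with an "items" array and an optional "nextPageToken". The sketch below is only a starting point and assumes the bucket allows anonymous listing, which may well not be the case for this bucket. `list_bucket_objects` and the injectable `fetch` parameter are hypothetical names, added so the paging logic can be tried without network access.

```python
import json
from typing import Callable, Iterator
from urllib.parse import quote, urlencode
from urllib.request import urlopen

API_ROOT = "https://storage.googleapis.com/storage/v1/b"


def fetch_json(url: str) -> dict:
    """Default fetcher: GET the URL and parse the JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def list_bucket_objects(
    bucket: str,
    prefix: str = "",
    fetch: Callable[[str], dict] = fetch_json,
) -> Iterator[str]:
    """Yield object names in a bucket, following nextPageToken paging.

    Assumes anonymous users are allowed to list the bucket.
    """
    page_token = None
    while True:
        params = {"prefix": prefix}
        if page_token:
            params["pageToken"] = page_token
        url = f"{API_ROOT}/{quote(bucket, safe='')}/o?{urlencode(params)}"
        page = fetch(url)
        # A page with no matching objects has no "items" field.
        for item in page.get("items", []):
            yield item["name"]
        page_token = page.get("nextPageToken")
        if not page_token:
            break
```

Something like `list(list_bucket_objects("pheknowlator", prefix="archived_builds/"))` would be the intended usage, with the bucket name and prefix taken from the URLs in this thread. If anonymous listing is not enabled, a published listing file remains the simpler option.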