Open jamesfebin opened 1 week ago
Hi @jamesfebin - OntoGPT can do this, and your template is a great start - it just needs some more details for the LLM to work with.
(The imports should also include core, as this defines the main OntoGPT types.)
So if the input text is something like this:
In a surprise move, the city council of Oakdale voted to approve a new development project led by prominent businesswoman, Emily-Jane Lee. The project, which will bring a new shopping center and several restaurants to the downtown area, has been met with both excitement and skepticism from local residents. Council members, including Chairperson Maria Rodriguez, Vice Chair John Michael Davis Jr., and Councilor Sofia Patel, cited the potential economic benefits and job creation as key factors in their decision. However, some residents, such as longtime Oakdale resident and activist, Ava Morales, have expressed concerns about the impact on traffic and local small businesses. Despite these concerns, project investor, Julian Styles, remains confident that the development will be a success and a boon to the community.
Then a template like this should work:
id: https://w3id.org/linkml/examples/personinfo
name: personinfo
prefixes:
  linkml: https://w3id.org/linkml/
imports:
  - linkml:types
  - core
default_range: string
classes:
  Container:
    tree_root: true
    attributes:
      persons:
        description: >-
          A semicolon-delimited list of people named in the text.
        multivalued: true
        inlined_as_list: true
        range: Person
  Person:
    description: >-
      A person.
    attributes:
      full_name:
        description: >-
          The full name of the person.
        range: string
Run something like ontogpt extract -t personinfo.yaml -i input.txt
and you should get a result like:
---
input_text: In a surprise move, the city council of Oakdale voted to approve a new
  development project led by prominent businesswoman, Emily-Jane Lee. The project,
  which will bring a new shopping center and several restaurants to the downtown area,
  has been met with both excitement and skepticism from local residents. Council members,
  including Chairperson Maria Rodriguez, Vice Chair John Michael Davis Jr., and Councilor
  Sofia Patel, cited the potential economic benefits and job creation as key factors
  in their decision. However, some residents, such as longtime Oakdale resident and
  activist, Ava Morales, have expressed concerns about the impact on traffic and local
  small businesses. Despite these concerns, project investor, Julian Styles, remains
  confident that the development will be a success and a boon to the community.
raw_completion_output: 'persons: Emily-Jane Lee; Maria Rodriguez; John Michael Davis
  Jr.; Sofia Patel; Ava Morales; Julian Styles;'
prompt: |+
  Split the following piece of text into fields in the following format:
  full_name: <The full name of the person.>
  Text:
  Julian Styles
  ===
extracted_object:
  persons:
    - full_name: Emily-Jane Lee
    - full_name: Maria Rodriguez
    - full_name: John Michael Davis Jr.
    - full_name: Sofia Patel
    - full_name: Ava Morales
    - full_name: Julian Styles
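To make the relationship between raw_completion_output and extracted_object more concrete, here is a minimal Python sketch of how a semicolon-delimited answer can be turned into a list of Person-like records. This is only an illustration: the function name parse_persons is hypothetical, and it is not OntoGPT's actual parsing code.

```python
# Illustrative only: NOT OntoGPT's real parser. parse_persons is a
# hypothetical helper showing why the "persons" description asks the LLM
# for a semicolon-delimited list.

def parse_persons(raw_completion: str) -> list[dict[str, str]]:
    """Split a 'persons: a; b; c;' completion into full_name records."""
    # Drop the leading field label ("persons") and keep the value part.
    _, _, value = raw_completion.partition(":")
    # Split on semicolons, trim whitespace, and skip the empty trailing item.
    names = [name.strip() for name in value.split(";")]
    return [{"full_name": name} for name in names if name]


raw = (
    "persons: Emily-Jane Lee; Maria Rodriguez; John Michael Davis Jr.; "
    "Sofia Patel; Ava Morales; Julian Styles;"
)
persons = parse_persons(raw)
for person in persons:
    print(person["full_name"])
```

The trailing semicolon in the completion produces an empty final item, which is why empty strings are filtered out.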
Thank you, @caufieldjh. I am able to generate the YAML file.
However, I get the following when I use OWL as the output format, and it doesn't generate a valid .owl file.
INFO:root:Output format: owl
INFO:linkml.generators.pythongen:TRUE: OCCURS SAME: Container == Person owning: Container
INFO:linkml.generators.pythongen:TRUE: OCCURS SAME: Container == Person owning: Container
INFO:linkml.generators.pythongen:FALSE: OCCURS BEFORE: Any == Any owning: ExtractionResult
INFO:linkml.generators.pythongen:FALSE: OCCURS BEFORE: Any == Any owning: ExtractionResult
INFO:linkml.generators.pythongen:TRUE: OCCURS SAME: TextWithTriples == Publication owning: TextWithTriples
INFO:linkml.generators.pythongen:FALSE: OCCURS BEFORE: Triple == Triple owning: TextWithTriples
INFO:linkml.generators.pythongen:TRUE: OCCURS SAME: TextWithTriples == Publication owning: TextWithTriples
INFO:linkml.generators.pythongen:FALSE: OCCURS BEFORE: Triple == Triple owning: TextWithTriples
INFO:linkml.generators.pythongen:TRUE: OCCURS SAME: TextWithEntity == Publication owning: TextWithEntity
INFO:linkml.generators.pythongen:TRUE: OCCURS SAME: TextWithEntity == Publication owning: TextWithEntity
INFO:root:Subject=None
INFO:root:Subject=None
INFO:root:Cannot determine axiom type for full_name, unprocessed=[Literal(v='Emily-Jane Lee')]
INFO:root:Subject=None
INFO:root:Cannot determine axiom type for full_name, unprocessed=[Literal(v='Maria Rodriguez')]
INFO:root:Subject=None
INFO:root:Cannot determine axiom type for full_name, unprocessed=[Literal(v='John Michael Davis Jr.')]
INFO:root:Subject=None
INFO:root:Cannot determine axiom type for full_name, unprocessed=[Literal(v='Sofia Patel')]
INFO:root:Subject=None
INFO:root:Cannot determine axiom type for full_name, unprocessed=[Literal(v='Ava Morales')]
INFO:root:Subject=None
INFO:root:Cannot determine axiom type for full_name, unprocessed=[Literal(v='Julian Styles')]
INFO:root:Cannot determine axiom type for persons, unprocessed=[]
Generating OWL requires a few more format-specific details so the OWL interpreter knows how to define relationships the LinkML format doesn't identify. Try this:
id: https://w3id.org/linkml/examples/personinfo
name: personinfo
prefixes:
  linkml: https://w3id.org/linkml/
  personinfo: https://w3id.org/linkml/examples/personinfo/
imports:
  - linkml:types
  - core
default_range: string
default_prefix: personinfo
classes:
  Container:
    tree_root: true
    attributes:
      persons:
        description: >-
          A semicolon-delimited list of people named in the text.
        multivalued: true
        inlined_as_list: true
        annotations:
          owl: ObjectProperty, ObjectSomeValuesFrom
        range: Person
  Person:
    is_a: NamedEntity
    description: >-
      A person.
    attributes:
      full_name:
        description: >-
          The full name of the person.
        range: string
      id:
        description: >-
          A unique identifier for the person.
          This is their full name without spaces
          or special characters.
        identifier: true
        range: string
That should generate OWL like this:
Prefix( owl: = <http://www.w3.org/2002/07/owl#> )
Prefix( rdf: = <http://www.w3.org/1999/02/22-rdf-syntax-ns#> )
Prefix( rdfs: = <http://www.w3.org/2000/01/rdf-schema#> )
Prefix( xsd: = <http://www.w3.org/2001/XMLSchema#> )
Prefix( xml: = <http://www.w3.org/XML/1998/namespace> )
Prefix( linkml: = <https://w3id.org/linkml/> )
Prefix( personinfo: = <https://w3id.org/linkml/examples/personinfo/> )
Prefix( shex: = <http://www.w3.org/ns/shex#> )
Prefix( schema: = <http://schema.org/> )
Prefix( NCIT: = <http://purl.obolibrary.org/obo/NCIT_> )
Prefix( RO: = <http://purl.obolibrary.org/obo/RO_> )
Prefix( biolink: = <https://w3id.org/biolink/vocab/> )
Prefix( core: = <http://w3id.org/ontogpt/core/> )
Ontology( <https://w3id.org/linkml/examples/personinfo>
AnnotationAssertion( rdfs:label personinfo:EmilyJaneLee "Emily-Jane Lee" )
AnnotationAssertion( rdfs:label personinfo:MariaRodriguez "Maria Rodriguez" )
AnnotationAssertion( rdfs:label personinfo:JohnMichaelDavisJr "John Michael Davis Jr" )
AnnotationAssertion( rdfs:label personinfo:SofiaPatel "Sofia Patel" )
AnnotationAssertion( rdfs:label personinfo:AvaMorales "Ava Morales" )
AnnotationAssertion( rdfs:label personinfo:JulianStyles "Julian Styles" )
SubClassOf( None ObjectSomeValuesFrom( personinfo:persons personinfo:EmilyJaneLee ) )
SubClassOf( None ObjectSomeValuesFrom( personinfo:persons personinfo:MariaRodriguez ) )
SubClassOf( None ObjectSomeValuesFrom( personinfo:persons personinfo:JohnMichaelDavisJr ) )
SubClassOf( None ObjectSomeValuesFrom( personinfo:persons personinfo:SofiaPatel ) )
SubClassOf( None ObjectSomeValuesFrom( personinfo:persons personinfo:AvaMorales ) )
SubClassOf( None ObjectSomeValuesFrom( personinfo:persons personinfo:JulianStyles ) )
)
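The identifiers in that output (for example personinfo:EmilyJaneLee) follow the id description in the template: the full name with spaces and special characters removed. In OntoGPT the id value comes from the LLM following that description, so the Python sketch below is only an assumption about what the rule amounts to. The helper name person_curie is hypothetical, and "special characters" is interpreted here as anything other than ASCII letters and digits.

```python
import re

# Illustrative only: in OntoGPT the id is produced by the LLM from the
# attribute description. person_curie is a hypothetical helper that applies
# the same rule deterministically, treating every character other than
# ASCII letters and digits as "special".

def person_curie(full_name: str, prefix: str = "personinfo") -> str:
    """Build a prefixed identifier from a person's full name."""
    local_id = re.sub(r"[^A-Za-z0-9]", "", full_name)
    return f"{prefix}:{local_id}"


for name in ["Emily-Jane Lee", "John Michael Davis Jr.", "Sofia Patel"]:
    print(person_curie(name))
```

Note that a rule like this would drop accented or non-Latin letters entirely, so names outside ASCII would need a different normalization.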
Thank you again, @caufieldjh.
However, when I import this into Protégé or another OWL visualizer, I get an error. Can you point me to any documentation or resources so I can study and solve these issues myself (that is, how to write a YAML file that generates OWL data models)?
Hi @jamesfebin, OntoGPT uses LinkML tools for generating OWL (and other serializations) so you may find these docs helpful: https://linkml.io/linkml/generators/owl.html
Remember that the LinkML generators are for schema conversion. OntoGPT uses linkml-owl for data conversion.
I am trying to use OntoGPT in a domain outside of bioinformatics. At present I am trying something simple: extracting the names of people from a given text.
I have a dumb question.
The values are pre-defined in most of the templates I have seen (e.g. vbo_names). So when I modify and use such a template, even though it's a valid LinkML file, OntoGPT doesn't add the values to the OWL output; for example, only the last value in a list of people's names is added. It also gives errors like
Custom Template I made.
Is there a template that's a bit generic I can use in this case?