neuml / txtai

💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
https://neuml.github.io/txtai
Apache License 2.0

What are the API endpoints to use the semantic graph? #394

Closed akset2X closed 1 year ago

akset2X commented 1 year ago

Glad to know that txtai has added the semantic graph as a new feature. How can it actually be used from programs written in other languages that expect to call it as an API?

Are there any API endpoints for it, like search and extract? Is anybody using it? Please let me know. It would also help (at least for me) to have details on how to translate the Python setup into a YAML config file to use graphs, categories and topic modeling.

Following is my config file content used to start the server with uvicorn:

# Index file path
path: ./tmp/index

# Allow indexing of documents
writable: True

# Embeddings index
embeddings:
  path: sentence-transformers/all-MiniLM-L6-v2
  content: true
# I manually added the lines below, up to the extractor section
  functions:
  - name: graph
    function: graph.attribute
  expressions:
  - name: category
    expression: graph(indexid, 'category')
  - name: topic
    expression: graph(indexid, 'topic')
  - name: topicrank
    expression: graph(indexid, 'topicrank')
  graph:
    limit: 15
    minscore: 0.1
    topics:
      categories:
      - Society & Culture
      - Science & Mathematics
      - Health
      - Education & Reference
      - Computers & Internet
      - Sports
      - Business & Finance
      - Entertainment & Music
      - Family & Relationships
      - Politics & Government

extractor:
  path: distilbert-base-cased-distilled-squad

textractor:
  paragraphs: true
  minlength: 100
  join: false

Output:

...
ModuleNotFoundError: No module named 'graph' 
ERROR: Application startup failed. Exiting
davidmezzetti commented 1 year ago

There are no direct endpoints for semantic graphs, but they can be accessed with workflows. You can either create a custom pipeline or use a SQL statement to access the graph for search.

This notebook is an example of running SQL statements with workflows: https://colab.research.google.com/github/neuml/txtai/blob/master/examples/28_Push_notifications_with_workflows.ipynb#scrollTo=_B4YFu-1R2QC
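
For example, the expression columns defined in the config above (category, topic, topicrank) can be referenced directly in a SQL search once the index is built. A rough sketch follows; the file name config.yml, the sample documents and the query text are placeholders, not part of this thread.

# Rough sketch: build the index from the YAML config above, then reference the
# graph-derived expression columns (topic, category) directly in a SQL search.
# config.yml, the sample documents and the query text are placeholders.
from txtai.app import Application

# Load the application from the config file shown above
app = Application("config.yml")

# Index a few sample documents
app.add([
    {"id": 0, "text": "US tops 5 million confirmed virus cases"},
    {"id": 1, "text": "Canada's last fully intact ice shelf has suddenly collapsed"},
    {"id": 2, "text": "Beijing mobilises invasion craft along coast as Taiwan tensions escalate"},
])
app.index()

# SQL search that pulls the derived graph columns alongside the text
results = app.search(
    "select id, text, topic, category from txtai where similar('health crisis')"
)
for result in results:
    print(result)

The linked notebook wires this same kind of SQL statement into a workflow.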

davidmezzetti commented 1 year ago

And I'll have to look more into the error you're seeing with the YAML syntax.

akset2X commented 1 year ago

Thank you for the update. I'm not sure whether the .yml content attached in my first comment is correct; could you please verify it and point out what I'm missing?

For reference, the following is the complete error log from starting the config with uvicorn:

INFO:     Started server process [22240]
2022-12-08 22:06:56,331 [INFO] serve: Started server process [22240]
INFO:     Waiting for application startup.
2022-12-08 22:06:56,331 [INFO] startup: Waiting for application startup.
ERROR:    Traceback (most recent call last):
  File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\starlette\routing.py", line 635, in lifespan
    async with self.lifespan_context(app):
  File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\starlette\routing.py", line 530, in __aenter__
    await self._router.startup()
  File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\starlette\routing.py", line 614, in startup
    handler()
  File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\txtai\api\application.py", line 67, in start
    INSTANCE = Factory.create(config, api) if api else API(config)
  File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\txtai\api\base.py", line 18, in __init__
    super().__init__(config, loaddata)
  File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\txtai\app\base.py", line 73, in __init__
    self.indexes(loaddata)
  File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\txtai\app\base.py", line 164, in indexes
    fn["function"] = self.function(fn["function"])
  File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\txtai\app\base.py", line 255, in function
    return PipelineFactory.create({}, function)
  File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\txtai\pipeline\factory.py", line 52, in create
    pipeline = PipelineFactory.get(pipeline)
  File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\txtai\pipeline\factory.py", line 36, in get
    return Resolver()(pipeline)
  File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\txtai\util\resolver.py", line 27, in __call__
    m = __import__(module)
ModuleNotFoundError: No module named 'graph'

2022-12-08 22:06:59,540 [ERROR] send: Traceback (most recent call last):
  File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\starlette\routing.py", line 635, in lifespan
    async with self.lifespan_context(app):
  File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\starlette\routing.py", line 530, in __aenter__
    await self._router.startup()
  File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\starlette\routing.py", line 614, in startup
    handler()
  File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\txtai\api\application.py", line 67, in start
    INSTANCE = Factory.create(config, api) if api else API(config)
  File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\txtai\api\base.py", line 18, in __init__
    super().__init__(config, loaddata)
  File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\txtai\app\base.py", line 73, in __init__
    self.indexes(loaddata)
  File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\txtai\app\base.py", line 164, in indexes
    fn["function"] = self.function(fn["function"])
  File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\txtai\app\base.py", line 255, in function
    return PipelineFactory.create({}, function)
  File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\txtai\pipeline\factory.py", line 52, in create
    pipeline = PipelineFactory.get(pipeline)
  File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\txtai\pipeline\factory.py", line 36, in get
    return Resolver()(pipeline)
  File "c:\users\ak\appdata\local\programs\python\python39\lib\site-packages\txtai\util\resolver.py", line 27, in __call__
    m = __import__(module)
ModuleNotFoundError: No module named 'graph'

ERROR:    Application startup failed. Exiting.
2022-12-08 22:06:59,541 [ERROR] startup: Application startup failed. Exiting.
davidmezzetti commented 1 year ago

I think it might be a bug in how applications load this config. I'll follow up when I have time to take a look.

davidmezzetti commented 1 year ago

This has been fixed in the main branch. It will be released with txtai 5.2.

davidmezzetti commented 1 year ago

This issue was resolved with txtai 5.2.
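
Once the fix is in place, no graph-specific endpoint is needed: the standard /search endpoint accepts the same SQL, so the expression columns defined in the config are reachable over the API. A minimal sketch; the host, port and query are placeholders.

# Minimal sketch: query the graph-derived columns through the running
# service's /search endpoint (host, port and query are placeholders).
import requests

response = requests.get(
    "http://localhost:8000/search",
    params={
        "query": "select id, text, topic, category from txtai where similar('health crisis')",
        "limit": 5,
    },
)
print(response.json())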