Dataherald / dataherald

Interact with your SQL database, Natural Language to SQL using LLMs
https://dataherald.readthedocs.io/en/latest/
Apache License 2.0

fix sorting table relevance scores #515

Closed daniel309 closed 4 months ago

daniel309 commented 4 months ago

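This fixes the sort order of table relevance scores in the output. Previously the lowest-scoring tables were listed first, in ascending order. Now the most relevant tables come first, in descending order.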

Previously:

engine   | Action: DbTablesWithRelevanceScores
engine   | Action Input: what is the total number of albums
engine   | Observation: Table: `main.media_types`, relevance score: 0.0771
engine   | Table: `main.customers`, relevance score: 0.0827
engine   | Table: `main.employees`, relevance score: 0.1003
engine   | Table: `main.invoices`, relevance score: 0.1282
engine   | Table: `main.invoice_items`, relevance score: 0.1407
engine   | Table: `main.genres`, relevance score: 0.1568

Now:

engine   | Action: DbTablesWithRelevanceScores
engine   | Action Input: what is the total number of albums
engine   | Observation: Table: `main.albums`, relevance score: 0.3576
engine   | Table: `main.artists`, relevance score: 0.2226
engine   | Table: `main.playlist_track`, relevance score: 0.1908
engine   | Table: `main.playlists`, relevance score: 0.1676
engine   | Table: `main.genres`, relevance score: 0.1568
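
For illustration, the intended ordering can be sketched as below. This is a minimal sketch, not the actual Dataherald code: the function names `top_tables` and `format_observation` and the `k` parameter are hypothetical, and the scores are copied from the logs above.

```python
from typing import List, Tuple


def top_tables(scores: List[Tuple[str, float]], k: int) -> List[Tuple[str, float]]:
    """Return the k highest-scoring tables, most relevant first (hypothetical helper)."""
    # reverse=True sorts in descending order, so the best match comes first.
    return sorted(scores, key=lambda item: item[1], reverse=True)[:k]


def format_observation(rows: List[Tuple[str, float]]) -> str:
    """Render rows in the same shape as the agent's Observation lines."""
    return "\n".join(
        f"Table: `{name}`, relevance score: {score:.4f}" for name, score in rows
    )


# Scores taken from the log output above, deliberately unordered.
scores = [
    ("main.media_types", 0.0771),
    ("main.genres", 0.1568),
    ("main.albums", 0.3576),
    ("main.customers", 0.0827),
    ("main.artists", 0.2226),
    ("main.playlists", 0.1676),
    ("main.playlist_track", 0.1908),
]

print(format_observation(top_tables(scores, k=5)))
```

Sorting before truncating matters here: slicing first and sorting afterwards would keep whichever tables happened to come first rather than the most relevant ones.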

daniel309 commented 4 months ago

The tests that run are green. So I am not making things worse ;-)

The warnings about test classes having __init__ constructors should be resolved in another PR.

docker-compose exec engine pytest
================================================================== test session starts ===================================================================
platform linux -- Python 3.11.4, pytest-7.4.0, pluggy-1.5.0
rootdir: /app
configfile: pyproject.toml
plugins: anyio-4.4.0, dotenv-0.5.2
collected 1 item

dataherald/tests/test_api.py .                                                                                                                     [100%]

==================================================================== warnings summary ====================================================================
../usr/local/lib/python3.11/site-packages/httpx/_client.py:680
  /usr/local/lib/python3.11/site-packages/httpx/_client.py:680: DeprecationWarning: The 'app' shortcut is now deprecated. Use the explicit style 'transport=WSGITransport(app=...)' instead.
    warnings.warn(message, DeprecationWarning)

dataherald/tests/db/test_db.py:8
  /app/dataherald/tests/db/test_db.py:8: PytestCollectionWarning: cannot collect test class 'TestDB' because it has a __init__ constructor (from: dataherald/tests/db/test_db.py)
    class TestDB(DB):

dataherald/tests/evaluator/test_eval.py:10
  /app/dataherald/tests/evaluator/test_eval.py:10: PytestCollectionWarning: cannot collect test class 'TestEvaluator' because it has a __init__ constructor (from: dataherald/tests/evaluator/test_eval.py)
    class TestEvaluator(Evaluator):

dataherald/tests/sql_generator/test_generator.py:11
  /app/dataherald/tests/sql_generator/test_generator.py:11: PytestCollectionWarning: cannot collect test class 'TestGenerator' because it has a __init__ constructor (from: dataherald/tests/sql_generator/test_generator.py)
    class TestGenerator(SQLGenerator):

dataherald/tests/vector_store/test_vector_store.py:9
  /app/dataherald/tests/vector_store/test_vector_store.py:9: PytestCollectionWarning: cannot collect test class 'TestVectorStore' because it has a __init__ constructor (from: dataherald/tests/vector_store/test_vector_store.py)
    class TestVectorStore(VectorStore):

dataherald/tests/test_api.py::test_heartbeat
  /app/dataherald/tests/conftest.py:9: RemovedIn20Warning: Deprecated API features detected! These feature(s) are not compatible with SQLAlchemy 2.0. To prevent incompatible upgrades prior to updating applications, ensure requirements files are pinned to "sqlalchemy<2.0". Set environment variable SQLALCHEMY_WARN_20=1 to show all deprecation warnings.  Set environment variable SQLALCHEMY_SILENCE_UBER_WARNING=1 to silence this message. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)
    engine.execute(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================================================= 1 passed, 6 warnings in 2.22s ==============================================================
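
One possible way to address the PytestCollectionWarning entries in a follow-up PR is sketched below. pytest skips any class whose `__test__` attribute is `False`, so helper classes named `Test*` can opt out of collection. This is only a sketch: the `DB` base class here is a hypothetical stand-in, not the project's real class, and renaming the helpers (for example to `FakeDB`) would be an alternative.

```python
class DB:
    """Hypothetical stand-in for the project's DB base class."""

    def __init__(self, system=None):
        self.system = system


class TestDB(DB):
    # Mark this class as a helper, not a test case, so pytest does not
    # try to collect it and warn about the __init__ constructor.
    __test__ = False

    def __init__(self, system=None):
        super().__init__(system)
        # In-memory storage used by the fake implementation (assumed shape).
        self.memory = {}
```

The class keeps its constructor and can still be instantiated as a test double. pytest simply no longer treats it as a container of tests.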