Closed poneoneo closed 1 month ago
Thank you for highlighting these concerns. We understand the challenges you are facing, and we want to address each issue:
Inconsistent DataFrames: We recognize that DataHorse, being powered by a large language model (LLM), can return different results for the same query due to the model's dynamic nature. To mitigate this, we are working on standardizing the output schema by defining specific constraints and filters to ensure more consistent DataFrame results. We are also implementing a post-processing step to align the DataFrame with expected columns and structure.
Inconsistent Plot Renderings: Similar to the data inconsistency, the varying plots are a result of the LLM’s flexible interpretation of queries. We are establishing a set of standard plot types and formats for each query to ensure uniformity. Additionally, we will include a normalization step in our post-processing to standardize the plots generated.
Library Dependency Issues: We are aware of the import errors caused by missing libraries. To address this, we will update our documentation to include all necessary dependencies and ensure they are properly managed in our environment. We are also working on handling missing libraries more gracefully by implementing conditional imports.
@SsebowaDisan @poneoneo I reckon setting the seed will generate similar or in some cases the same result which may help in solving the issue here for the llama-8b-8192 Reference used - https://medium.com/@kyfex/groqs-chat-settings-dfbb601efdca Should I try on the changes needed?
Hello @Sohammhatre10 , Thank you for the suggestion about setting the seed! It does sound like a great approach for ensuring more consistent results. I appreciate your willingness to contribute to the project.
If you’d like to help with this, feel free to dive into the implementation based on the reference you shared.
Looking forward to collaborating with you on this!
Best regards, Disan
I think it's could be great to provide endpoints for user and allow them to change it based on their needs
@poneoneo That'll be great for personalization.
Have tried to clear this using a different caching mechanism and seeding technique. Works well here https://github.com/DeDolphins/DataHorse/pull/9 @SsebowaDisan @poneoneo Hope this solves the issue!
I decided to integrate datahorse into my project as an ai-agent. So for each query a csv file is provided to data-horse but sometime for the same query i'm often facing different scenario:
bedhind the scene i decided to add "always return the result as dataframe" to reduce the occurence amount of all errors above mentionned.
So how to fix result or plot render or that datahorse try to import not available library?