CDLUC3 / data-curation

Exploratory project to catalog and evaluate datasets. We will determine ways to evaluate data files against the indicators above and offer solutions for increasing their quality. We aim to translate best practices into workflows that help with everyday use cases.

Gemini API access from Colab #16

Open sfisher opened 1 week ago

sfisher commented 1 week ago

Steve was able to get ChatGPT to parse a data file, give feedback and visualize it.

How can we automate that in a coding environment? (Do we want to investigate Gradio, an interface library for working with ML models?)
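One way the Gradio idea could look in a notebook is sketched below. It assumes the `gradio` package is installed; `profile_csv`, `handle_upload`, and `build_demo` are hypothetical names, and the local profile is only a stand-in for whatever feedback a model would return.

```python
import csv


def profile_csv(path):
    """Return a short plain-text profile of a CSV file (stdlib only)."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        columns = reader.fieldnames or []
    lines = [f"{len(rows)} rows, {len(columns)} columns"]
    for name in columns:
        # Count cells that are absent or contain only whitespace.
        missing = sum(1 for row in rows if not (row.get(name) or "").strip())
        lines.append(f"{name}: {missing} missing")
    return "\n".join(lines)


def handle_upload(uploaded):
    """Callback for the upload widget.

    Assumption: depending on the Gradio version, the File component passes
    either a path string or a temp-file object exposing `.name`.
    """
    path = getattr(uploaded, "name", uploaded)
    return profile_csv(path)


def build_demo():
    """Wire the callback into a one-input, one-output Gradio interface."""
    import gradio as gr  # lazy import so profile_csv works without Gradio

    return gr.Interface(
        fn=handle_upload,
        inputs=gr.File(label="CSV file"),
        outputs="text",
    )
```

In Colab, calling `build_demo().launch()` should start the interface in the notebook output. Swapping `profile_csv` for a function that calls a model would keep the rest of the wiring unchanged.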

To discover (probably a lot about how the API works):

sfisher commented 2 days ago

I have a demo that uses Gradio together with Google Vertex Studio, a design studio Google makes available for creating queries and testing prompt engineering. It also offers a simpler approach, with example code that is higher level than their API.
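For comparison with the Studio's example code, a direct call from Colab might look like the sketch below. This is not the demo's code: it assumes the `google-generativeai` Python package (rather than the Vertex AI SDK), an API key stored in a `GEMINI_API_KEY` environment variable, and the model name `gemini-1.5-flash`. `build_prompt` and `ask_gemini` are hypothetical helper names.

```python
import os


def build_prompt(csv_text, max_chars=4000):
    """Wrap a (possibly truncated) CSV sample in a data-quality instruction."""
    sample = csv_text[:max_chars]  # keep the request small for large files
    return (
        "You are reviewing a CSV file for data quality. "
        "List concrete problems and suggest improvements.\n\n"
        "CSV sample:\n" + sample
    )


def ask_gemini(prompt, model_name="gemini-1.5-flash"):
    """Send the prompt to Gemini and return the text of the reply.

    Assumptions: the google-generativeai package is installed and
    GEMINI_API_KEY is set; the model name is only an example.
    """
    import google.generativeai as genai  # lazy import, optional dependency

    genai.configure(api_key=os.environ["GEMINI_API_KEY"])
    model = genai.GenerativeModel(model_name)
    response = model.generate_content(prompt)
    return response.text
```

Usage would be along the lines of `ask_gemini(build_prompt(open("data.csv").read()))`, where `data.csv` is a placeholder file name.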

I was able to have it accept a CSV file and make suggestions for improving data quality. When I asked it to visualize the results, it simply gave me Python code to run (most of which ran, though some didn't), which seems to make it harder to create automatic visualizations for an arbitrary CSV structure.
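One possible way around running model-generated plotting code is to choose a chart per column locally from inferred types. The sketch below is only an illustration of that idea, not something the demo does; `infer_kind`, `plan_charts`, and the chart labels are hypothetical, and the type inference is deliberately crude.

```python
import csv
import io


def infer_kind(values):
    """Classify a column as 'numeric', 'categorical', or 'empty'."""
    present = [v.strip() for v in values if v and v.strip()]
    if not present:
        return "empty"
    try:
        for value in present:
            float(value)  # raises ValueError on non-numeric text
    except ValueError:
        return "categorical"
    return "numeric"


def plan_charts(csv_text):
    """Map each column name to a suggested chart type."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    chart_for = {
        "numeric": "histogram",
        "categorical": "bar chart of value counts",
    }
    plan = {}
    for name in reader.fieldnames or []:
        kind = infer_kind([row.get(name) for row in rows])
        plan[name] = chart_for.get(kind, "skip")  # nothing to plot if empty
    return plan
```

Each entry in the plan could then be handed to a plotting library, so the visualization step no longer depends on whether generated code happens to run.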