Project Proposals - GitHub Issues

mikeizbicki commented 5 days ago

This issue is for submitting your project proposals. Before class next Wednesday 16 Oct, reply with:

The title
List of group members
2-5 sentence description of what you'll do
List of any references / links you'll use

aopsahl25 commented 5 days ago

LMQL Tutorial
Working alone
For my final project, I will make an LMQL tutorial in the form of a GitHub repository. I plan to cover the following topics: LMQL prompt construction and text generation (including constrained generation), model measuring (results classification and confidence scores), meta prompting, and potentially implementing chatbots with LMQL's interactive generation/results streaming. I will explain each topic individually and also highlight how they interact.
References:
-- https://lmql.ai/
-- https://www.datacamp.com/tutorial/introduction-to-lmql
-- https://arxiv.org/pdf/2212.06094
-- https://medium.com/@abhishekranjandev/lmql-a-deep-dive-into-the-future-of-language-model-interaction-81297cf3ab2c
-- https://towardsdatascience.com/lmql-sql-for-language-models-d7486d88c541
-- https://wandb.ai/mostafaibrahim17/ml-articles/reports/Unveiling-LMQL-The-Future-of-Interacting-with-Language-Models--Vmlldzo2NzgzMjcy
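
As a rough illustration of the constrained generation and confidence-score topics listed above, here is a minimal sketch in plain Python. It does not use LMQL: the log-probabilities in `scores` are made up to stand in for what a model might assign, and `constrained_choice` is a hypothetical helper name.

```python
import math

def constrained_choice(label_scores, allowed):
    """Pick the best label from `allowed` and return a confidence for each.

    label_scores: dict mapping candidate strings to log-probabilities
                  (hypothetical values; a real model would supply these).
    allowed:      the only outputs the constraint permits.
    """
    # Drop every candidate that violates the constraint.
    kept = {label: s for label, s in label_scores.items() if label in allowed}
    if not kept:
        raise ValueError("no allowed label has a score")
    # Renormalize with a softmax so the confidences sum to 1 over `allowed`.
    top = max(kept.values())
    weights = {label: math.exp(s - top) for label, s in kept.items()}
    total = sum(weights.values())
    confidences = {label: w / total for label, w in weights.items()}
    best = max(confidences, key=confidences.get)
    return best, confidences

# Made-up scores: the unconstrained favorite ("maybe") is not an allowed label.
scores = {"maybe": -0.4, "positive": -1.2, "negative": -2.5}
best, conf = constrained_choice(scores, allowed={"positive", "negative"})
print(best, round(conf["positive"], 3))
```

The point of the sketch is only that a constraint removes disallowed outputs before a choice is made, and that the remaining scores can be renormalized into a confidence per label.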

KentaWood commented 1 day ago

1. Project Title: Fantasy (Football) Sports Assistant
2. Team: Solo

3. Project Description:
The project is a web app designed as an assistant for fantasy sports leagues, with a particular focus on fantasy football. The app will provide personalized player recommendations based on real-time data such as player stats, injuries, and matchups. It will also offer AI-driven insights for drafting strategies and trade suggestions. One feature I am considering is daily or weekly performance summaries, along with a Q&A feature that lets users interact with the assistant and make informed decisions to optimize their fantasy teams.

4. Potential APIs and Data Sources:

Yahoo Fantasy Sports API: Yahoo Fantasy Data
ChatGPT API for AI insights: OpenAI API
Sleeper Football Data API: Sleeper Football Data

finnless commented 1 day ago

Title: Build open-text social survey response coder library
Team: Nolan
Project Description: This project is an open-source library that uses language models to code the open-ended response questions often used in public opinion polling. The target audience is social science researchers, so the library will be easy to use. There will be features for setting a codebook, inputting survey responses, and validating the classifier's accuracy on manually labeled ground-truth test data.
Links:
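
For the validation step in the description above, a minimal sketch of checking predicted codes against manual labels might look like this. Everything here is an assumption for illustration: the `CODEBOOK` contents, the sample responses, and the keyword-matching `code_response` function, which stands in for the language model call the library would actually make.

```python
from collections import Counter

# Hypothetical codebook: code name -> keywords that signal it.
CODEBOOK = {
    "economy": ["jobs", "inflation", "prices"],
    "health": ["hospital", "insurance", "doctor"],
}

def code_response(text, codebook=CODEBOOK):
    """Stand-in classifier: assign the code whose keywords match most often.

    A real version would call a language model here instead.
    """
    words = text.lower().split()
    counts = Counter()
    for code, keywords in codebook.items():
        counts[code] = sum(words.count(k) for k in keywords)
    code, hits = counts.most_common(1)[0]
    return code if hits > 0 else "other"

def accuracy(responses, gold_codes):
    """Fraction of responses whose predicted code equals the manual label."""
    if len(responses) != len(gold_codes):
        raise ValueError("responses and gold_codes must have the same length")
    correct = sum(code_response(r) == g for r, g in zip(responses, gold_codes))
    return correct / len(responses)

# Made-up survey answers with manually assigned labels.
responses = [
    "prices keep going up and jobs are scarce",
    "my insurance does not cover the hospital bill",
    "I do not follow politics",
]
gold = ["economy", "health", "other"]
print(accuracy(responses, gold))
```

Keeping the classifier behind a single function means the same accuracy check can be reused unchanged once a model-backed classifier replaces the keyword stand-in.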

albert-bpc commented 23 hours ago

Title: Financial Statement Generator
Team: Albert
Project Description: Use LMQL to take financial data scraped from PDFs and insert the data into an Excel model for the 3 financial statements. The target audience is analysts who have to manually move numbers from unstructured documents into Excel sheets. There will be a front end where the user can upload a zip file of PDFs and an Excel file will be returned.
Links:
https://lmql.ai/docs/models/azure.html
https://azure.microsoft.com/en-us/products/ai-services/ai-document-intelligence
https://www.wallstreetprep.com/knowledge/financial-modeling/
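
To make the data flow in the description concrete, here is a minimal sketch of routing extracted line items to the three statements. It assumes the PDF text has already been extracted, and it uses a regular expression where the proposal would use LMQL. The `STATEMENT_OF` mapping, the sample text, and all function names are hypothetical, and writing the result into an Excel file is left out.

```python
import re

# Hypothetical mapping from line-item label to one of the three statements.
STATEMENT_OF = {
    "revenue": "income_statement",
    "net income": "income_statement",
    "total assets": "balance_sheet",
    "cash from operations": "cash_flow_statement",
}

# Matches lines such as "Revenue  1,200" or "Net income: (35.5)".
LINE_ITEM = re.compile(r"^\s*([A-Za-z ]+?)\s*:?\s+(\(?[\d,]+(?:\.\d+)?\)?)\s*$")

def parse_amount(raw):
    """Convert '1,200' to 1200.0 and accounting-style '(35.5)' to -35.5."""
    negative = raw.startswith("(") and raw.endswith(")")
    value = float(raw.strip("()").replace(",", ""))
    return -value if negative else value

def extract_statements(text):
    """Group recognized line items by the statement they belong to."""
    statements = {name: {} for name in set(STATEMENT_OF.values())}
    for line in text.splitlines():
        match = LINE_ITEM.match(line)
        if not match:
            continue
        label = match.group(1).strip().lower()
        if label in STATEMENT_OF:
            statements[STATEMENT_OF[label]][label] = parse_amount(match.group(2))
    return statements

# Made-up text standing in for the contents of a scraped PDF.
sample = """Fiscal year 2023
Revenue  1,200
Net income: (35.5)
Total assets 5,000
Cash from operations 410
"""
result = extract_statements(sample)
print(result["income_statement"])
```

Lines whose label is not in the mapping, such as the fiscal year header, are skipped rather than guessed at.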

RuiZhangg commented 13 hours ago

Title: Course Recommendation Bot
Group: solo
Description: I plan to use the RAG and evaluation techniques we learned earlier in the course to implement a chatbot that can answer questions about course selection at the 5Cs. I need to build a database for it, modify the prompts, and create a test set to evaluate its performance.
Reference: Topic01 and Topic02
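
The test-set evaluation mentioned above could be sketched roughly as follows. This is only one possible metric (whether the reply mentions the expected course), and the `toy_bot` function, the questions, and the course codes are all invented placeholders rather than real data.

```python
def hit_rate(answer_fn, test_set):
    """Share of test questions whose answer mentions the expected course.

    answer_fn: callable taking a question string and returning a reply string.
    test_set:  list of (question, expected_course_code) pairs.
    """
    if not test_set:
        raise ValueError("test_set is empty")
    hits = 0
    misses = []
    for question, expected in test_set:
        reply = answer_fn(question)
        if expected.lower() in reply.lower():
            hits += 1
        else:
            # Keep failures so they can be inspected by hand afterwards.
            misses.append((question, expected, reply))
    return hits / len(test_set), misses

# Placeholder bot with canned replies; the course codes are invented.
def toy_bot(question):
    if "language models" in question.lower():
        return "You could take CS-EXAMPLE-101."
    return "I am not sure."

test_set = [
    ("Which course covers language models?", "CS-EXAMPLE-101"),
    ("Which course covers databases?", "CS-EXAMPLE-202"),
]
rate, misses = hit_rate(toy_bot, test_set)
print(rate, len(misses))
```

Because `hit_rate` accepts any callable, the same test set can be rerun after each prompt or database change to compare versions of the bot.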

EthanTu2 commented 13 hours ago

Title: Medical Machine Translation Research Paper Implementation

Group: Solo

Description: I am currently writing a survey paper on medical machine translation publications since 2021, co-authored with a professor at CMC. This project will cover one part of that paper: I plan to use my own data to implement an approach from one of the research papers I like, experiment and innovate further, compare the results with those reported in other papers in the field, and then decide whether the approach should be explored further.

References: Topic04, Google Scholar, arXiv, PubMed

ains-arch commented 11 hours ago

RAG for Pitzer IT Help Desk
me unless someone is secretly really interested in this topic
RAG system over Pitzer's public IT documentation. Run some kind of scraper over the relevant parts of the Pitzer website, throw it into a RAG system, whip up some quick and dirty chatbot that one could hypothetically use to solve IT-related problems. Figure out if I can compare it to whatever Freshworks' version of this product is (Freddy AI?). Encourage useful results... maybe have it pull from a larger database if the initial response isn't well received. Possibly give up and do something more interesting if I get bored.
pitzercollege.freshservice.com, the rest of the IT section of the Pitzer website, RAG stuff from earlier in the semester
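
A minimal sketch of the "pull from a larger database" idea above, under loud assumptions: retrieval is plain bag-of-words cosine similarity rather than a real RAG stack, a low similarity score stands in for a response that "isn't well received", and the documents, the threshold, and all function names are made up.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def best_match(query, docs):
    """Return (score, doc) for the document most similar to the query."""
    q = Counter(tokenize(query))
    return max(((cosine(q, Counter(tokenize(d))), d) for d in docs), key=lambda p: p[0])

def retrieve(query, primary, fallback, threshold=0.2):
    """Search the primary corpus first; use the fallback if the match is weak."""
    score, doc = best_match(query, primary)
    if score >= threshold:
        return "primary", doc
    score, doc = best_match(query, fallback)
    return "fallback", doc

# Invented stand-ins for scraped help desk pages and a larger backup corpus.
primary = [
    "To reset your password, open the account portal and choose reset password.",
    "Connect to campus wifi by selecting the secure network and signing in.",
]
fallback = [
    "Printer jams are usually cleared by opening the rear tray and removing the paper.",
]
print(retrieve("How do I reset my password?", primary, fallback)[0])
print(retrieve("The printer has a paper jam", primary, fallback)[0])
```

In a real system the fallback trigger could instead be explicit user feedback; the threshold here is just the simplest signal that fits in a self-contained example.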

mikeizbicki commented 1 hour ago

These projects look good as-is: @aopsahl25 @finnless @RuiZhangg @albert-bpc

@KentaWood

It's not clear to me where the course tie-in is. The current way of doing sports analytics type stuff is standard machine learning algorithms like logistic regression. Doing that wouldn't be enough. You'll need to have something incorporating language. One possibility is generating written responses off of the existing analytics, but it's not clear to me how to do this in a way that actually provides value.
It's fine to incorporate your project into a webapp, but the webapp portion won't earn you credit for the course. The important part for the course is the language backend components.
The phrase "AI-driven insights" is a marketing phrase that has no meaning and its use signals that the speaker is non-technical. You should remove it from your vocabulary and in the future be more specific about what those insights are.

@EthanTu2 This doesn't have enough course tie-in. You need to somehow use language models to do something. Some example possibilities include: (1) Use an LLM to implement a translation system. (2) Write a chatbot that lets you chat with the research papers. (3) If you go down the path of implementing another paper, you need to tell me exactly which paper you'll be implementing and what data you'll be using.

@ains-arch I like the idea. To be actually helpful for your given use case, though, users would probably need to be able to upload screenshots. So think about ways to include these screenshots in the chatbot as well.

mikeizbicki / cmc-csci181-languages

Project Proposals #26