Weekly Report 2024-11-8

Built a main.py file that connects the screenshot command, the image-to-text model, and the text-to-audio model. The program now has an initial working version that supports all the basic features.
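For context, the chaining can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the actual main.py: `take_screenshot`, `image_to_text`, `text_to_audio`, and `run_pipeline` are hypothetical placeholder names, and the stage bodies return dummy values instead of calling the real screenshot command or models.

```python
from typing import Callable


# Hypothetical stand-ins for the real stages; each returns dummy data
# so the sketch runs without any model or screenshot tool installed.
def take_screenshot() -> bytes:
    """Pretend to capture the screen and return raw image bytes."""
    return b"fake-image-bytes"


def image_to_text(image: bytes) -> str:
    """Pretend to describe the image with an image-to-text model."""
    return f"description of {len(image)} bytes"


def text_to_audio(text: str) -> bytes:
    """Pretend to synthesize speech; here the text is just encoded."""
    return text.encode("utf-8")


def run_pipeline(
    capture: Callable[[], bytes] = take_screenshot,
    describe: Callable[[bytes], str] = image_to_text,
    speak: Callable[[str], bytes] = text_to_audio,
) -> bytes:
    """Chain the three stages: screenshot -> text -> audio."""
    image = capture()
    text = describe(image)
    return speak(text)


if __name__ == "__main__":
    audio = run_pipeline()
    print(f"produced {len(audio)} bytes of audio")
```

Passing the stages in as parameters is only one possible design; it makes it easy to swap a slow model for a stub while testing the rest of the chain.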

The program still hits some errors, and because they occur only intermittently they are confusing to debug. Once these are resolved I will focus on performance: the current model chain can take over 2 minutes to process a screenshot on my laptop, while my desktop PC usually handles it in under 30 seconds.
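One way to approach both problems is to wrap each stage so that its duration and any exception are logged. The sketch below is an assumption about how that could look, not something already in the project: `timed_stage` is a hypothetical helper, and only the standard-library `logging` and `time` modules are used.

```python
import logging
import time
from typing import Any, Callable

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")


def timed_stage(name: str, func: Callable[..., Any], *args: Any) -> Any:
    """Run one stage, log how long it took, and log a traceback if it fails."""
    start = time.perf_counter()
    try:
        return func(*args)
    except Exception:
        # Record the full traceback so intermittent failures leave evidence,
        # then re-raise so the caller still sees the error.
        log.exception("stage %r failed", name)
        raise
    finally:
        # Runs on success and on failure, so every attempt gets a timing line.
        log.info("stage %r took %.2f s", name, time.perf_counter() - start)


if __name__ == "__main__":
    # Dummy stage standing in for a real model call.
    result = timed_stage("double", lambda x: x * 2, 21)
    print(result)
```

Comparing the per-stage timings from the laptop and the desktop should show which stage accounts for most of the gap, and the logged tracebacks give a record of the intermittent errors to look through afterwards.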
