better chatbot: vision-fed chatbot

Vandivier commented 7 months ago

the ladderly custom GPT is built on transcript info

let's add machine vision where it takes images from high value videos and makes useful summaries from them

Vandivier commented 7 months ago

if we sandbag long enough maybe someone will make a video reader instead of the existing image reader approach

still, an image reader is a big win over not having one; lots of videos have no transcript bc they are just an image, meme, screenshot, or text wall

Vandivier commented 1 day ago

all the video to text tools are still using glorified transcription

the low hanging fruit approach here is to take top-performing no-voice videos and write up the takeaways myself

the long term, harder but high value approach is to make a tool that takes screenshots at a variable cadence (every 1, 2, 3, 10, or 30s; I doubt longer than 30s will be useful, though a list of user-defined timestamps could be useful too)

then, snap an image at those timestamps and run each one through an AI image-to-text tool
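
rough sketch of the sampling idea above, in Python just for illustration. Everything here is an assumption rather than an existing ladderly API: `sample_timestamps`, `ffmpeg_frame_command`, and `summarize_video` are hypothetical names, frame grabbing is assumed to go through the `ffmpeg` CLI, and the image-to-text step is left as a callable you plug in since no specific vision model has been picked.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def sample_timestamps(
    duration_s: float,
    cadence_s: float = 10.0,
    extra: Optional[Iterable[float]] = None,
) -> List[float]:
    """Return sorted, de-duplicated timestamps (seconds) at which to grab a frame.

    Combines a fixed cadence (e.g. every 1, 2, 3, 10, or 30s) with an optional
    list of user-defined timestamps. Timestamps outside the video are dropped.
    """
    if cadence_s <= 0:
        raise ValueError("cadence_s must be positive")
    stamps = set()
    n = 0
    while n * cadence_s < duration_s:
        stamps.add(round(n * cadence_s, 3))
        n += 1
    for t in extra or ():
        if 0 <= t < duration_s:
            stamps.add(round(float(t), 3))
    return sorted(stamps)


def ffmpeg_frame_command(video_path: str, t: float, out_path: str) -> List[str]:
    """Build an ffmpeg argument list that writes one frame at time t to out_path.

    Assumes ffmpeg is installed; pass the result to subprocess.run to execute it.
    """
    return ["ffmpeg", "-y", "-ss", f"{t:.3f}", "-i", video_path,
            "-frames:v", "1", out_path]


def summarize_video(
    video_path: str,
    duration_s: float,
    grab_frame: Callable[[str, float], str],
    describe_image: Callable[[str], str],
    cadence_s: float = 10.0,
    extra: Optional[Iterable[float]] = None,
) -> List[Tuple[float, str]]:
    """Grab a frame at each timestamp and describe it with the supplied callable.

    grab_frame(video_path, t) should return an image path; describe_image(path)
    is a placeholder for whatever image-to-text model ends up being used.
    """
    notes = []
    for t in sample_timestamps(duration_s, cadence_s, extra):
        image_path = grab_frame(video_path, t)
        notes.append((t, describe_image(image_path)))
    return notes
```

keeping the frame grabber and the describer as injected callables means the cadence logic can be tested without ffmpeg or a model, and it would be easy to move the loop into a background job later.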

that should really be its own saas tbh and it feels like smth that requires background job operation rather than being a serverless thing, i could be wrong tho

Vandivier / ladderly-3

better chatbot: vision-fed chatbot #165