Closed AaronSosaRamos closed 3 weeks ago
I was checking this out and found a similar issue with the quota. Investigating it on the STAGING branch, I found that the cause is the way the Map Reduce algorithm works.
Essentially, it sends a batch of requests, one for each of the N chunks of the transcript, and that is what produces the latency. This will require an entirely different approach: I think we can solve it with a single large request rather than iterative requests, because of the rate limits.
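To make the request count concrete, here is a rough sketch that estimates how many calls a map-reduce pass issues for a transcript of a given length. It assumes a plain sliding-window splitter and exactly one combine call. The function names are hypothetical, and a real splitter that breaks on separators, or a chain that needs extra collapse steps, will produce different numbers.

```python
import math


def count_chunks(text_len: int, chunk_size: int, chunk_overlap: int) -> int:
    """Approximate chunk count for a sliding window over text_len characters."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    if text_len <= chunk_size:
        return 1
    # Each new chunk advances by (chunk_size - chunk_overlap) characters.
    step = chunk_size - chunk_overlap
    return 1 + math.ceil((text_len - chunk_size) / step)


def map_reduce_requests(text_len: int, chunk_size: int, chunk_overlap: int) -> int:
    """One 'map' request per chunk plus one 'reduce' request (assumed)."""
    return count_chunks(text_len, chunk_size, chunk_overlap) + 1
```

With these assumptions, a 9000-character transcript split with `chunk_size=1000` and `chunk_overlap=100` gives 10 chunks, so about 11 requests, compared with 1 request for a single large prompt.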
Because of that, I will not merge this into STAGING for now. Instead, I will make a new branch and try to solve this using Gemini 1.5 Pro with its 1M context window.
Since `gemini-1.5-flash` came out, we can also experiment with that, using as few requests as possible while not losing information, in a two-layer process:
SUMMARIZE -> EXTRACT
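A minimal sketch of what the two-layer flow could look like, assuming the model is passed in as a plain callable that maps a prompt string to a response string. The function name, the `llm` parameter, and the prompt wording are all hypothetical and are not taken from the codebase.

```python
from typing import Callable


def summarize_then_extract(transcript: str, llm: Callable[[str], str]) -> str:
    """Run SUMMARIZE -> EXTRACT with exactly two model requests."""
    # Layer 1: one request over the whole transcript (relies on a large context window).
    summary = llm("Summarize the following transcript:\n\n" + transcript)
    # Layer 2: one request over the much shorter summary.
    return llm("Extract the key concepts from this summary:\n\n" + summary)
```

Because the model is injected, the flow can be exercised with a stub before spending any quota on the real API.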
This PR analyzes and implements a new approach for solving Issue #23, related to Dynamo. When I started using the system, I tested this 9-minute video: https://www.youtube.com/watch?v=rWcG-p1oQe0. I received the Quota Exceeded error after 5 retry attempts.
As a result, I tested another video with a duration of 1 minute: https://www.youtube.com/watch?v=1aA1WGON49E, and it worked with the current approach.
For that reason, I refactored chunk_size and chunk_overlap so that they are set according to the length of the video, using a threshold of 3 minutes.
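The duration-based selection described above could be sketched roughly as follows. Only the 3-minute threshold comes from this PR. The concrete sizes, the helper name, and the choice of larger chunks for longer videos (to reduce the number of requests) are assumptions for illustration.

```python
from typing import NamedTuple


class ChunkParams(NamedTuple):
    chunk_size: int
    chunk_overlap: int


# Threshold from the PR description: 3 minutes.
SHORT_VIDEO_THRESHOLD_SECONDS = 3 * 60


def chunk_params_for(duration_seconds: float) -> ChunkParams:
    """Pick splitter settings from the video length (placeholder values)."""
    if duration_seconds <= SHORT_VIDEO_THRESHOLD_SECONDS:
        # Short video: small chunks are fine, since few requests are needed.
        return ChunkParams(chunk_size=1000, chunk_overlap=100)
    # Long video: larger chunks mean fewer chunks and fewer requests.
    return ChunkParams(chunk_size=4000, chunk_overlap=400)
```

The returned pair can then be passed to whichever text splitter the loader uses.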
Then I tested both videos, and both worked correctly.
I also tried this 5-minute video: https://www.youtube.com/watch?v=iDbdXTMnOmE, and it worked as well.
In fact, I tested it 4 times and the parser did not fail in any of those attempts.
What I've discovered is that the long latency is caused by the `summarize_transcript` method, especially when the video is long (9 to 10 minutes). For now, we can work this way, but I suggest optimizing the `summarize_transcript` method based on how we use `load_summarize_chain`, because that is where the latency comes from.
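To confirm where the time goes before optimizing, a small timing wrapper like the one below may help. This is a generic sketch. The `timed` helper is hypothetical, and the real signature of `summarize_transcript` is not shown in this PR, so the usage line is only an assumption.

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Hypothetical usage, assuming summarize_transcript takes the video URL:
# summary, seconds = timed(summarize_transcript, video_url)
# print(f"summarize_transcript took {seconds:.1f}s")
```

Comparing the elapsed time for the 1-minute, 5-minute, and 9-minute videos would show how the latency scales with transcript length.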