Sweetdevil144 / Youtube-Shorts-Creator

A Website that helps extract shorts segments from a youtube video link
https://youtube-shorts-creator.onrender.com

Refactoring and Improving `extractShorts` and `divideCaptionsIntoChunks` in `fetchresults.js` #2

Closed Sweetdevil144 closed 1 year ago

Sweetdevil144 commented 1 year ago

1. Update the Data Structure:

Our fetched transcripts object has changed to a new format:

{
  transcript: [
    {
      snippet: "okay in terms of solutions yeah I'm",
      start_time: '0:02'
    }, 
    ...more items
  ]
}

The `transcript` property is an array of objects, each with a `snippet` and a `start_time`, so any function that consumed the old captions format must be updated to read this new structure.
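For reference, consuming the new structure looks like this (the second transcript item is made up purely for illustration):

```javascript
// The shape below mirrors the example above; field names come from the issue.
const transcripts = {
  transcript: [
    { snippet: "okay in terms of solutions yeah I'm", start_time: "0:02" },
    { snippet: "going to start with the first one", start_time: "0:05" },
  ],
};

// Any consumer of the old captions format now reads transcripts.transcript:
for (const { snippet, start_time } of transcripts.transcript) {
  console.log(`[${start_time}] ${snippet}`);
}
```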

2. Dividing Captions into Chunks (Token Limit Consideration):

The primary goal of the divideCaptionsIntoChunks function is to divide the video transcript into sections or "chunks" so that they can be analyzed by the OpenAI API without exceeding the token limit.

Steps for the updated divideCaptionsIntoChunks are covered under the detailed code changes below.

3. Analyzing the Chunks:

The purpose of analyzeCaptions is to assess how suitable each chunk is for a short video. The more engaging or relevant a chunk, the higher its score or "rating" should be.

For the updated analyzeCaptions, see the detailed code changes below.
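One possible shape for the updated analyzeCaptions is sketched here, with the OpenAI call stubbed out as a `rateChunk` parameter so only the data flow is shown — `rateChunk` and the `{ start_time, rating }` output shape are assumptions, not existing code:

```javascript
// Hypothetical sketch: score each chunk for "short" suitability and return
// the chunks sorted highest-rated first. rateChunk stands in for the real
// OpenAI call (e.g. a 1-10 engagement score).
async function analyzeCaptions(chunks, rateChunk) {
  const rated = [];
  for (const chunk of chunks) {
    const text = chunk.map((item) => item.snippet).join(" ");
    const rating = await rateChunk(text);
    // Tag the rating with the chunk's first timestamp so the frontend
    // knows where the candidate short begins.
    rated.push({ start_time: chunk[0].start_time, rating });
  }
  return rated.sort((a, b) => b.rating - a.rating);
}
```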

4. Sending Data to the Frontend:

You'll pass the organized and analyzed data from fetchresults.js to app.js, which in turn sends it to the frontend script.js; the concrete steps are in the code changes below.
Detailed Steps and Code Changes:

  1. Divide the Captions into Chunks:

    • Modify the divideCaptionsIntoChunks function to loop over transcripts.transcript:

      function divideCaptionsIntoChunks(transcripts) {
        let chunks = [];
        let currentChunk = [];
        let currentTokens = 0;
        // Loop over the array inside the new transcripts object.
        for (let item of transcripts.transcript) {
          const tokens = item.snippet.split(" ").length; // naive token count based on words
          if (currentTokens + tokens > 15000) { // keeping a margin for safety
            chunks.push(currentChunk);
            currentChunk = [item];
            currentTokens = tokens;
          } else {
            currentChunk.push(item);
            currentTokens += tokens;
          }
        }
        if (currentChunk.length) {
          chunks.push(currentChunk);
        }
        return chunks;
      }
  2. Analyzing the Chunks:

    • Since we're aiming for a specific output format, our conversation with the API should ask for that format explicitly. Remember that GPT-3.5 might not always return the exact format you want, so post-processing might be needed.
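As one hedged example of that post-processing, a defensive parser could extract the first JSON array from the completion and fall back to `null` on malformed output; `parseModelOutput` and the expected `{ start_time, rating }` shape are illustrative assumptions:

```javascript
// GPT-3.5 may wrap its answer in prose or code fences, so parse defensively
// instead of trusting the raw completion text.
function parseModelOutput(raw) {
  // Pull out the first JSON array that appears anywhere in the text.
  const match = raw.match(/\[[\s\S]*\]/);
  if (!match) return null;
  try {
    const parsed = JSON.parse(match[0]);
    return Array.isArray(parsed) ? parsed : null;
  } catch {
    return null; // let the caller retry the request or skip this chunk
  }
}
```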
  3. Update app.js:

    • Once you get the shorts from fetchResults.extractShorts, send them as a JSON response:
      return res.json({ success: true, shorts: topShorts });
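The handler logic around that response might look like the following sketch, written as a plain async function rather than a full Express route so the JSON shape stays visible; `shortsHandler` and its parameters are illustrative names, with `extractShorts` standing in for `fetchResults.extractShorts`:

```javascript
// Hypothetical app.js handler: run extraction, then send the shorts as JSON,
// falling back to a 500 response with an error message on failure.
async function shortsHandler(req, res, extractShorts) {
  try {
    const topShorts = await extractShorts(req.body.videoUrl);
    return res.json({ success: true, shorts: topShorts });
  } catch (err) {
    return res.status(500).json({ success: false, error: err.message });
  }
}
```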
  4. Update script.js:

    • Use the returned shorts data to embed the videos:
      if (data.success) {
        embedVideos(data.shorts);
      }
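A minimal sketch of what embedVideos might do, assuming each short carries a `videoId` and a transcript-style `"m:ss"` `start_time` — these field names are assumptions, not the actual fetchresults.js output:

```javascript
// Convert the transcript-style "m:ss" start_time into whole seconds.
function toSeconds(startTime) {
  return startTime.split(":").map(Number).reduce((total, part) => total * 60 + part, 0);
}

// Build a YouTube embed URL that starts playback at the short's timestamp.
function buildEmbedUrl(videoId, startTime) {
  return `https://www.youtube.com/embed/${videoId}?start=${toSeconds(startTime)}`;
}

// Append one iframe per short to the results container.
function embedVideos(shorts, container = document.getElementById("results")) {
  for (const short of shorts) {
    const iframe = document.createElement("iframe");
    iframe.src = buildEmbedUrl(short.videoId, short.start_time);
    iframe.allowFullscreen = true;
    container.appendChild(iframe);
  }
}
```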