connortbot / squeak

Learn languages through stories you will love. (https://squeak.today/)
https://connortbot.github.io/squeak/

News articles by subject w/ Command-R (backend) #5

Open connortbot opened 4 days ago

connortbot commented 4 days ago

description

Add "news articles" generated with Command-R+ (08-2024) to the back lambda, and make them accessible through the REST endpoint.

details

New file structure for backend S3

french/
-> B1/
--> Basketball/
---> News/
----> B1_News_Basketball_2024_11_09.json
----> B1_News_Basketball_2024_11_09.json
---> Story/
----> B1_Story_Basketball_2024_11_09.json
----> B2_Story_Basketball_2024_11_09.json

connortbot commented 3 days ago

on god this is NOT possible

tested with this:

// Known problem: responses run away when Command-R+ is expected to web search, generate, and translate in one call.
// Fixed if the prompt and message are written in the target language.
func generateNewsArticle(language string, cefr string, query string) (v1Response, error) {
    cohereAPIKey := os.Getenv("COHERE_API_KEY")
    emptyResponse := v1Response{}
    if cohereAPIKey == "" { return emptyResponse, errors.New("ERR: COHERE_API_KEY environment variable not set") }

    startingMessage := fmt.Sprintf("LANGUAGE: %s\nCEFR: %s\nSEARCH QUERY: %s", language, cefr, query)
    randomIndex := rand.Intn(len(sources))
    randomSource := sources[randomIndex]
    coherePayload := map[string]interface{}{
        "chat_history": []map[string]string{
            {
                "role": "system",
                "message": strings.TrimSpace(`
                    You are Squeak, an LLM designed to write news articles in certain languages as language learning content.

                    You will be provided a language, CEFR level, and search query. 

                    First, search the web for news stories. DO NOT MODIFY THE SEARCH QUERY FROM WHAT WAS PROVIDED.
                    To write the article, you can only use sources that are from today's date.

                    Then, with the found information, you must write a news article in the provided language. The requirements for the news article are as follows:
                    - The news article MUST cover a specific event or story relevant to today and use sources only dated to today.
                    - The news article must be written in the provided LANGUAGE.
                    - The article must be written and formatted as a news article.
                    - Write a story that matches the reading difficulty of the provided CEFR level.

                    You should provide the story without preamble or other comment.`),
            },
        },
        "message": startingMessage,
        "connectors": []map[string]interface{}{
            {
                "id": "web-search",
                "options": map[string]string{
                    "site": randomSource,
                },
            },
        },
    }

    log.Println("Marshalling Cohere Payload")
    jsonData, err := json.Marshal(coherePayload)
    if err != nil { return emptyResponse, err }

    req, err := http.NewRequest("POST", "https://api.cohere.com/v1/chat", bytes.NewBuffer(jsonData))
    if err != nil { return emptyResponse, err }

    req.Header.Set("Accept", "application/json")
    req.Header.Set("Content-Type", "application/json")
    req.Header.Set("Authorization", "Bearer " + cohereAPIKey)

    log.Println("Committing REQ")
    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil { return emptyResponse, err }
    defer resp.Body.Close()
    log.Println("RECEIVE RESPONSE")

    body, err := io.ReadAll(resp.Body)
    if err != nil { return emptyResponse, err }

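    // Reject non-2xx responses (e.g. improper model id) before decoding,
    // so an API error body is not silently parsed into an empty result.
    if resp.StatusCode < 200 || resp.StatusCode > 299 {
        return emptyResponse, fmt.Errorf("ERR: Cohere returned status %d: %s", resp.StatusCode, string(body))
    }

    var result v1Response
    if err := json.Unmarshal(body, &result); err != nil {
        log.Println(string(body)) // log the raw body to help debug malformed responses
        return emptyResponse, err
    }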
    log.Println("Result")

    // log.Println(string(body))
    log.Println(result.Text)
    for _, doc := range result.Documents {
        log.Println(doc.URL)
    }
    return result, nil
}

func main() {
    // Surface the error instead of discarding it.
    if _, err := generateNewsArticle("French", "B1", "today politics news"); err != nil {
        log.Fatal(err)
    }
}

and it always breaks one of the following requirements:

I think it could be done by chaining 3 LLM calls, but it's a lot of work and prompt engineering for something that might not even work

connortbot commented 2 days ago

moved to feature/articles squeak-#5

connortbot commented 1 day ago

@Saai151 I finished backend news generation and I'm gonna change the back lambda to generate stories properly and everything (i.e. #4), so could you handle the front lambda? (like the REST API endpoint)

So the REST API needs to be updated to take:

connortbot commented 15 hours ago

backend news gen done in 4b18ec44c292cd56b21ad870970c9e75e87ac399. back lambda code currently only does:

languages := []string{"French"}
language_ids := map[string]string{
    "French": "fr",
}
cefrLevels := []string{"B1", "B2"}
subjects := []string{"Canada", "World", "Business", "Investing", "Politics", "Sports", "Arts"}
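
A small sketch of how these slices could expand into work items, assuming the back lambda generates one article per language, CEFR level, and subject combination (the thread doesn't actually say how it iterates). `job` and `buildJobs` are hypothetical names.

```go
package main

// job is a hypothetical unit of work: one article to generate.
type job struct {
	Language   string
	LanguageID string
	CEFR       string
	Subject    string
}

// buildJobs returns one job per language x CEFR level x subject combination,
// looking up each language's short id in ids.
func buildJobs(languages []string, ids map[string]string, cefrLevels, subjects []string) []job {
	jobs := make([]job, 0, len(languages)*len(cefrLevels)*len(subjects))
	for _, lang := range languages {
		for _, cefr := range cefrLevels {
			for _, subject := range subjects {
				jobs = append(jobs, job{Language: lang, LanguageID: ids[lang], CEFR: cefr, Subject: subject})
			}
		}
	}
	return jobs
}
```

With the values above that is 1 language x 2 levels x 7 subjects = 14 jobs per run.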