osscameroon / project-ideas

A list of project ideas

Camfranglais-LLM #42

Open Zaker237 opened 1 year ago

Zaker237 commented 1 year ago

Introduction

A lightweight language model that can understand our language (Camfranglais)

Description

Goal

The goal of the project is to build (and hopefully train) a language model that can understand Camfranglais, so that the model can later be used for translation (for example, into pure English or pure French).

Process

I don't have the whole process in mind but my idea is the following:

Why this project?

Relevant Technology

For the technology, it will mainly be Python with the following libraries:

and also some Web Technologies

Complexity

Required time

Categories

github-actions[bot] commented 1 year ago

It's great having you contribute to this project

Welcome to the community :nerd_face:, we will carefully review your project idea and get back to you.

If you would like to follow our community's work you should join us on our Telegram chat group and Channel, we help and encourage each other to contribute to open source.
You can also support us financially here to help us build Cameroon one open source at a time.

billmetangmo commented 1 year ago

Good, fun work to do, @Zaker237. I could be interested. BTW, I don't think it's necessary, at least for preliminary versions (v0.x), to train a new model.

Why? Because Camfranglais is more a mix of French/English words than of dialects (I don't know the real proportion, however). Assuming that is true, and since ChatGPT can easily understand a French/English mix: [screenshot: 2023-04-12]

and knows Camfranglais but makes some errors: [screenshot]

The easiest way should be to provide it with a dictionary, like this one from Valery Ndongo https://docsend.com/view/avvk5ef9qpvzy5zd, as context for ChatGPT, like below:

[screenshot: 1xzt5jn]
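The dictionary-as-context idea could be sketched roughly like this in Python. This is only a minimal illustration under a few assumptions: the `GLOSSARY` entries are placeholders written for the example (not taken from the dictionary linked above, and their meanings should be checked against it), and the helper names `find_entries` and `build_prompt` are hypothetical. The sketch only builds the prompt text; sending it to ChatGPT is left out.

```python
import re

# Placeholder glossary for illustration only. A real run would load entries
# from the dictionary linked above; these meanings are assumptions to verify.
GLOSSARY = {
    "tchop": "to eat; food",
    "wanda": "to be surprised",
    "do": "money",
}


def find_entries(sentence, glossary):
    """Return the glossary entries whose word appears in the sentence."""
    tokens = set(re.findall(r"\w+", sentence.lower()))
    return {word: meaning for word, meaning in glossary.items() if word in tokens}


def build_prompt(sentence, glossary, target="English"):
    """Build a translation prompt that carries only the relevant entries."""
    entries = find_entries(sentence, glossary)
    lines = [f"- {word}: {meaning}" for word, meaning in sorted(entries.items())]
    glossary_block = "\n".join(lines) if lines else "(no glossary entries matched)"
    return (
        f"Translate the Camfranglais sentence below into {target}.\n"
        "Use this glossary for words that are neither French nor English:\n"
        f"{glossary_block}\n\n"
        f"Sentence: {sentence}"
    )


if __name__ == "__main__":
    print(build_prompt("Je wanda, il n'a pas de do pour tchop", GLOSSARY))
```

Passing only the entries that match the input keeps the prompt short, which may matter if the full dictionary is too large to fit in the context.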

This could be a good use case for a ChatGPT plugin. Does anyone already have access?

Zaker237 commented 1 year ago

@billmetangmo That dictionary from Valery Ndongo is indeed a great resource for this project. Thanks, I didn't know about it.