*Tchoung-te*: a Yemba word meaning "association/group".
The objective of the project is to federate the metadata of all Cameroonian associations in France and make it more accessible to the community.
Presentation video (in French)
If you want to do data analysis, the latest raw database of Cameroonian associations is available here.
We also maintain a public dashboard to visualize the associations here.
If you are here, it means that you are interested in an in-house deployment of the solution. Follow the guide :) !
Execute the `init` and `command` scripts from the `.gitpod.yml` file, or use a ready-made development environment on Gitpod.

Then run the `filter-cameroon.ipynb` and `enrich-database.ipynb` notebooks:

```shell
pipenv shell
secretsfoundry run --script 'python filter-cameroon.py'
```
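The core of the filtering step can be sketched as follows. This is a minimal illustration only: the keyword list and the column names (`titre`, `objet`) are assumptions, not the project's actual logic.

```python
import csv
import io

# Hypothetical keywords used to spot Cameroonian associations in the registry.
KEYWORDS = ("cameroun", "camerounais", "camerounaise")

def looks_cameroonian(row: dict) -> bool:
    """Return True if the association title or object mentions Cameroon."""
    text = f"{row.get('titre', '')} {row.get('objet', '')}".lower()
    return any(kw in text for kw in KEYWORDS)

# Tiny inline sample standing in for the raw registry export.
raw = io.StringIO(
    "id,titre,objet\n"
    "1,Association des Camerounais de Lyon,entraide\n"
    "2,Club de bridge de Tours,loisirs\n"
    "3,Solidarite Cameroun,aide au developpement\n"
)

kept = [row for row in csv.DictReader(raw) if looks_cameroonian(row)]
print([row["id"] for row in kept])  # → ['1', '3']
```

The real notebook works on the full RNA export, but the shape of the step is the same: read, match, keep.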
Finally, use the resulting CSV file as a data source in Gogocarto and customize it. You can, for example, define icons per category (social object); ours are in `html/icons`. They were built from these base icons: https://thenounproject.com/behanzin777/kit/favorites/
```shell
csvdiff ref-rna-real-mars-2022.csv rna-real-mars-2022-new.csv -p 1 --columns 1 --format json | jq '.Additions' > experiments/update-database/diff.csv
python3 main.py
```
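The `csvdiff`/`jq` pipeline above extracts the rows whose primary key (column 1) appears in the new export but not in the reference file. A rough stdlib equivalent of that "additions" computation, on made-up sample data:

```python
import csv
import io

def additions(ref_csv: str, new_csv: str, key: str = "id") -> list[dict]:
    """Rows present in new_csv whose key is absent from ref_csv
    (what `csvdiff ... | jq '.Additions'` extracts)."""
    ref_keys = {row[key] for row in csv.DictReader(io.StringIO(ref_csv))}
    return [row for row in csv.DictReader(io.StringIO(new_csv))
            if row[key] not in ref_keys]

ref = "id,titre\nW751000001,Asso A\nW751000002,Asso B\n"
new = "id,titre\nW751000001,Asso A\nW751000003,Asso C\n"
print(additions(ref, new))  # → [{'id': 'W751000003', 'titre': 'Asso C'}]
```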
```shell
cd etl/
secretsfoundry run --script "chainlit run experiments/ui.py"
```
```shell
devspace deploy
```
The list of runs (`runs.csv`) was built by fetching all runs from the beginning with:

```shell
export LANGCHAIN_API_KEY=<key>
cd evals/
python3 rag-evals.py save_runs --days 400
```
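Conceptually, `save_runs --days 400` keeps every run inside the time window and writes its ID to `runs.csv`. A minimal stdlib sketch of that step (the run records and field names here are hypothetical; the real script fetches runs from the LangSmith API):

```python
import csv
from datetime import datetime, timedelta, timezone

def save_runs(runs: list[dict], days: int, path: str) -> int:
    """Write the IDs of runs newer than `days` days to a CSV file."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=days)
    recent = [r for r in runs if r["start_time"] >= cutoff]
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["run_id"])
        writer.writerows([r["id"]] for r in recent)
    return len(recent)

now = datetime.now(timezone.utc)
runs = [
    {"id": "run-1", "start_time": now - timedelta(days=10)},
    {"id": "run-2", "start_time": now - timedelta(days=500)},  # outside the window
]
print(save_runs(runs, days=400, path="runs.csv"))  # → 1
```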
Then we used Lilac to extract the most interesting questions by clustering them per topic/category. The "Associations in France" cluster was the one chosen, and we also deleted some rows due to irrelevance.
The clustering repartition is available here: Clustering Repartition
Finally, you just need to run:

```shell
export LANGCHAIN_API_KEY=<key>
cd evals/
python3 rag.py ragas_eval tchoung-te --run_ids_file=runs.csv
python3 rag.py deepeval tchoung-te --run_ids_file=runs.csv
```
Whenever you change a parameter that can affect the RAG pipeline, you can execute all inputs present in `evals/base_ragas_evaluation.csv` and track them with LangSmith. Then fetch the runs and execute the commands above. Since there are only 27 elements, you can compare the results manually.
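With only 27 questions, a manual comparison can be as simple as a per-question delta between two eval runs. A hypothetical sketch (question IDs and scores are made up for illustration):

```python
# Hypothetical per-question scores from two eval runs (before/after a parameter change).
before = {"q1": 0.82, "q2": 0.61, "q3": 0.90}
after = {"q1": 0.85, "q2": 0.55, "q3": 0.90}

# Per-question deltas: regressions (negative deltas) stand out immediately.
deltas = {qid: round(after[qid] - before[qid], 2) for qid in before}
regressions = [qid for qid, d in deltas.items() if d < 0]
print(regressions)  # → ['q2']
```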
```shell
cd etl/
python3 backtesting_prompt.py
```
Create the dataset on which you want to test the new prompt in LangSmith. Then run the file above to backtest the new prompt and review its results on the dataset. Specify the dataset name in the file before running.
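In spirit, the backtest runs the new prompt over every example in the named dataset and collects the outputs for review. The sketch below is illustrative only: the dataset name, prompt template, and `fake_llm` stand-in are placeholders, and the real script goes through LangSmith rather than a local list.

```python
DATASET_NAME = "tchoung-te-backtest"  # placeholder: set to your LangSmith dataset name

NEW_PROMPT = "Answer the question about Cameroonian associations in France: {question}"

def fake_llm(prompt: str) -> str:
    # Stand-in for the real model call.
    return f"[answer to: {prompt}]"

# Placeholder dataset; the real one lives in LangSmith.
dataset = [
    {"question": "Which associations operate in Lyon?"},
    {"question": "How do I register a new association?"},
]

results = [fake_llm(NEW_PROMPT.format(**example)) for example in dataset]
print(len(results))  # → 2
```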
Thanks goes to these wonderful people (emoji key):
Ghislain TAKAM ✅ 🔣 |
pdjiela ✅ |
DimitriTchapmi ✅ |
GNOKAM ✅ 🔣 |
fabiolatagne97 ✅ 🔣 |
hsiebenou 🔣 ⚠️ ✅ |
Flomin TCHAWE 💻 ✅ 🔣 |
Bill Metangmo 💻 🔣 🤔 ⚠️ ✅ |
dimitrilexi 🔣 |
ngnnpgn 🔣 |
Tchepga Patrick 🔣 |
This project follows the all-contributors specification. Contributions of any kind are welcome!