SingDict is a dictionary of Singlish terms built on definitions generated by ChatGPT.
The raw data for the dictionary is stored in the data/dictionary directory,
with each subdirectory named after the source of its vocabulary.
0_words.txt contains the words extracted from the respective source.
1_descriptions.jsonl contains ChatGPT's responses for the entries of those words.
The vocabulary directory contains the human-evaluated versions of the dictionary entries, taken from an Excel sheet that holds our work in progress. This directory is updated in batches, each time the script is run.
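As a rough illustration of the raw data layout, the Python sketch below loads the two files from a single source subdirectory. It is not part of the repository's scripts. The function name load_source is hypothetical, and it assumes that 0_words.txt holds one word or phrase per line. The fields inside each 1_descriptions.jsonl record are not documented here, so the records are returned as parsed without interpreting them.

```python
import json
from pathlib import Path


def load_source(source_dir):
    """Load the word list and raw ChatGPT responses for one source subdirectory.

    Hypothetical helper for illustration only.
    """
    source_dir = Path(source_dir)

    # Assumption: 0_words.txt holds one word or phrase per line.
    words_text = (source_dir / "0_words.txt").read_text(encoding="utf-8")
    words = [line.strip() for line in words_text.splitlines() if line.strip()]

    # JSONL stores one JSON value per line; the record schema is not assumed.
    descriptions = []
    with (source_dir / "1_descriptions.jsonl").open(encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                descriptions.append(json.loads(line))

    return words, descriptions
```

Calling load_source on a path under data/dictionary returns a pair: the list of words and the list of parsed response records. Blank lines in either file are skipped.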
The scripts directory contains the scripts that extract the words from the websites and the evaluated entries from Google Sheets.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.