NaXzyu / Dialogos

Dialogos: Pioneering Interactive Narratives and Language Proficiency with Enhanced AI in Unity
MIT License

Benchmarking Lexer Against Hugging Face Transformer #12

Open p3nGu1nZz opened 6 months ago

p3nGu1nZz commented 6 months ago

Benchmarking Lexer Against Hugging Face Transformer

Objective:

To evaluate the performance and effectiveness of our custom Lexer in comparison to the Hugging Face transformer, we will create a benchmark that measures speed, memory usage, and the quality of context representation.

Tasks:

Acceptance Criteria:

This ticket will guide the development of a comprehensive benchmarking suite that will inform our decision-making process regarding text processing tools within our project.

Josephrp commented 6 months ago

I want to follow along with this but don't know how much I can help ^^

p3nGu1nZz commented 6 months ago

> I want to follow along with this but don't know how much I can help ^^

You could make a simple Python script that tokenizes a string of no more than 1000 characters using Hugging Face Transformers, and tracks how long the tokenization takes as accurately as possible.
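Something along these lines could be a starting point. It is only a rough sketch, not the project's benchmark: the names `time_tokenizer` and `load_hf_tokenize`, the `bert-base-uncased` checkpoint, and the repeat and warm-up counts are all placeholders I made up for illustration. The timing loop takes any callable, so the same harness could later be pointed at our Lexer.

```python
import statistics
import time
from typing import Callable, List


def time_tokenizer(tokenize: Callable[[str], List], text: str,
                   repeats: int = 200, warmup: int = 10) -> dict:
    """Time tokenize(text) and return summary statistics in microseconds."""
    if len(text) > 1000:
        raise ValueError("text must be at most 1000 characters")
    # Warm-up calls so one-time setup costs do not skew the samples.
    for _ in range(warmup):
        tokenize(text)
    samples = []
    for _ in range(repeats):
        start = time.perf_counter_ns()  # monotonic, nanosecond resolution
        tokenize(text)
        samples.append((time.perf_counter_ns() - start) / 1_000)
    return {
        "runs": repeats,
        "min_us": min(samples),
        "median_us": statistics.median(samples),
        "mean_us": statistics.fmean(samples),
    }


def load_hf_tokenize(model_name: str = "bert-base-uncased"):
    """Return a Hugging Face tokenize callable.

    Assumption: the `transformers` package is installed and the checkpoint
    can be downloaded; the model name is only an example.
    """
    from transformers import AutoTokenizer
    return AutoTokenizer.from_pretrained(model_name).tokenize


if __name__ == "__main__":
    sample = ("The quick brown fox jumps over the lazy dog. " * 23)[:1000]
    # Stand-in tokenizer so the script runs without extra dependencies.
    # Swap in `load_hf_tokenize()` to time the Hugging Face tokenizer.
    print(time_tokenizer(str.split, sample))
```

The median is probably the number to compare, since the mean is easily distorted by a few slow runs (garbage collection, other processes). Loading the tokenizer is kept outside the timed loop on purpose, so only the tokenization itself is measured.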