Denis2054 / Transformers-for-NLP-2nd-Edition

Transformer models from BERT to GPT-4, environments from Hugging Face to OpenAI. Fine-tuning, training, and prompt engineering examples. A bonus section with ChatGPT, GPT-3.5-turbo, GPT-4, and DALL-E, including jump-starting GPT-4, speech-to-text, text-to-speech, text-to-image generation with DALL-E, Google Cloud AI, HuggingGPT, and more.
https://denis2054.github.io/Transformers-for-NLP-2nd-Edition/
MIT License

About Tokenizer.ipynb of Chapter 9 in the second edition #12

Open ElongHu opened 1 month ago

ElongHu commented 1 month ago

In Tokenizer.ipynb of Chapter 9 in the second edition, all of the similarity calculations are inconsistent with those in the book, and in some cases the conclusions drawn from the similarity results are the complete opposite.

Denis2054 commented 1 month ago

Thank you for this feedback. Here is the explanation and the solution:

  1. Explanation (see the sketch after this reply):
     a) The libraries, modules, and packages continually change, which can affect the code.
     b) AI NLP algorithms are stochastic, meaning there is randomness in the outputs.

  2. Solution: Run the more recent version, Tokenizers.ipynb, in the Transformers for NLP and CV, 3rd Edition repository: https://colab.research.google.com/github/Denis2054/Transformers-for-NLP-and-Computer-Vision-3rd-Edition/blob/main/Chapter10/Tokenizers.ipynb

It is open source, and thus free, and it is also self-contained.
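
To illustrate point 1.b, here is a minimal sketch of why similarity scores can drift from run to run. It assumes the notebook trains a gensim Word2Vec model on a small corpus and compares word pairs with cosine similarity; the corpus, word pair, and `run_seed` parameter below are hypothetical and only serve to show the effect, not to reproduce the book's exact numbers.

```python
# Minimal sketch: similarity scores depend on stochastic training.
# Assumption: the notebook builds word vectors with gensim Word2Vec
# and compares words with cosine similarity (model.wv.similarity).
from gensim.models import Word2Vec

# Tiny illustrative corpus (hypothetical, not the book's dataset).
sentences = [
    ["the", "judge", "read", "the", "statute"],
    ["the", "court", "applied", "the", "law"],
    ["the", "lawyer", "cited", "the", "statute"],
]

def statute_law_similarity(run_seed: int) -> float:
    # workers=1 plus a fixed seed makes a single run reproducible;
    # changing the seed, the corpus, or the library version changes
    # the learned vectors and therefore the similarity score.
    model = Word2Vec(
        sentences,
        vector_size=50,
        window=2,
        min_count=1,
        workers=1,
        seed=run_seed,
        epochs=50,
    )
    return model.wv.similarity("statute", "law")

print(statute_law_similarity(1))  # one value...
print(statute_law_similarity(2))  # ...a different value: the training is stochastic
```

Because the conclusions in the book were drawn from one particular run with one particular set of library versions, a reader re-running the notebook later can legitimately obtain different, sometimes opposite, similarity rankings.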