Open ElongHu opened 1 month ago
Thank you for this feedback. Here is the explanation and the solution:
Explanation: a) libraries, modules, and packages change continually, which can affect the code and its outputs; b) AI NLP algorithms are stochastic, meaning that there is randomness in their outputs, so similarity values can differ from one run to the next.
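To make point b) concrete, here is a minimal sketch in plain Python. It does not reproduce the notebook's model: the random "embeddings", the word pair, the vector size, and the helper names (`toy_embeddings`, `cosine_similarity`, `similarity_for_seed`) are all assumptions made up for illustration. It only shows that a different random initialization gives a different cosine similarity for the same word pair, while a fixed seed gives the same value every time.

```python
import math
import random


def toy_embeddings(words, dim, seed):
    """Build random vectors for each word (a stand-in for a trained model)."""
    rng = random.Random(seed)  # local generator, so the seed fully controls the result
    return {w: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for w in words}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors, in the range [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_for_seed(seed, word_a="king", word_b="queen", dim=16):
    """Similarity of a hypothetical word pair under one random initialization."""
    vectors = toy_embeddings([word_a, word_b], dim, seed)
    return cosine_similarity(vectors[word_a], vectors[word_b])


if __name__ == "__main__":
    # Same seed twice: identical values. Different seeds: different values.
    for seed in (1, 1, 2, 3):
        print(f"seed={seed}: similarity={similarity_for_seed(seed):.4f}")
```

A real embedding model behaves in a similar way when its seed is not fixed. For example, gensim's `Word2Vec` accepts a `seed` parameter, and its documentation notes that fully reproducible runs also require a single worker thread (`workers=1`). Whether the notebook sets these is not stated here, so treat this as a possible cause rather than a confirmed one.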
Solution: Run the more recent version, Tokenizers.ipynb, in the Transformers for NLP and CV, 3rd Edition repository: https://colab.research.google.com/github/Denis2054/Transformers-for-NLP-and-Computer-Vision-3rd-Edition/blob/main/Chapter10/Tokenizers.ipynb
It is open source, and therefore free, and it is self-contained.
In Tokenizer.ipynb for Chapter 9 of the second edition, all the similarity calculations are inconsistent with those in the book, and in some cases the conclusions drawn from the similarity values are completely opposite.