-
@irina060981 Irina, I created a few errors in the tokenization, and the error messages always refer to line 9 in the text. Could you please explain what line 9 is? See the sample error messages:
![Screen Sh…
-
Hi All,
I am trying to get some very basic tokenization to work. I think I am not using the API properly, because the `Tokenize` method is throwing a `System.NullReferenceException`. Any suggestions?
…
-
"8 oz chorizo" classified as "Aisles" This should likely be in Meat. This is happening because I have made "oz" a key for Aisles, and oz gets tokenized first. Even if I added chorizo to the dictionary…
-
- Search form is used for searching `Stack Template`s.
- The query in the input should be weighted by the following priority list (the top item is the most important):
1) Stack template slug
2) Stack…
-
Similar to servo/servo#1009
The first step is speculative parsing concurrent with scripts, [similar to what Gecko does](https://developer.mozilla.org/en-US/docs/Mozilla/Gecko/HTML_parser_threading).
-
In data_process.py, when importing the libraries, the following line:
**from chatglm_tokenizer.tokenization_chatglm import ChatGLMTokenizer**
raises the following error:
======================
----> 5 from chatglm_tokenizer.tokenization_chatglm import Ch…
-
Here is a list of investigations arising out of https://github.com/elastic/elasticsearch/pull/82870
- How should "strip_accents" in BERT-style WordPiece treat umlauts and diaereses? https://github.…
-
Hi! First of all, thanks for your work on building this library!
We're just taking the first steps in integrating Adyen and have a working version for the web so far. We wanted to integrate this now als…
-
Hi,
I have written a custom converter that converts JSON properties into the objects specified in a reloadable interface.
My JSON converter:
```
public class JsonConvertor implements Converter{
…
```
-
Right now tokenization encodes one line at a time. Ideally we should call the batched encode function on batches of data to speed things up!
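
A minimal sketch of the idea, assuming the tokenizer exposes some function that takes a list of strings and returns one encoding per string. `batched`, `encode_lines`, and `toy_encode_batch` are hypothetical names used for illustration; `toy_encode_batch` only stands in for the real batched encode function.

```python
from itertools import islice


def batched(lines, batch_size):
    """Yield successive lists of at most batch_size items from lines."""
    it = iter(lines)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def encode_lines(lines, encode_batch, batch_size=1000):
    """Encode lines with one encode_batch call per batch, not one per line."""
    encoded = []
    for batch in batched(lines, batch_size):
        encoded.extend(encode_batch(batch))
    return encoded


# Stand-in for a real tokenizer's batched encode function (hypothetical):
# it "encodes" each text as the list of its word lengths.
def toy_encode_batch(texts):
    return [[len(word) for word in text.split()] for text in texts]


print(encode_lines(["a bb", "ccc", "dd e f"], toy_encode_batch, batch_size=2))
```

Because `batched` consumes an iterator lazily, the same helper works on a file object, so the input does not need to be loaded into memory first.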