-
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1463, in wsgi_app
response = self.full_dispatch_request()
File "/usr/local/lib/python3.8/…
-
I'm hoping that we can get to the point where we fully support the following languages.
- English
- Spanish
- German
- French
- Russian
- Japanese
- Hindi
- Farsi
- Chinese
- Arabic
I s…
-
In v3, trailing whitespace after `\page` or `\column` prevents those commands from working. I think this could cause some unnecessary confusion.
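
To illustrate the kind of behavior described, here is a minimal sketch in Python. The two patterns and the `is_command` helper are hypothetical and are not taken from the v3 renderer's source. They only show how a strict end-of-line match rejects trailing whitespace, while a pattern that allows optional spaces or tabs accepts it.

```python
import re

# Hypothetical patterns for illustration only; the real renderer may differ.
# STRICT requires the command to be the entire line.
STRICT = re.compile(r"^\\(page|column)$")
# TOLERANT also accepts trailing spaces or tabs after the command.
TOLERANT = re.compile(r"^\\(page|column)[ \t]*$")


def is_command(line: str, tolerant: bool = False) -> bool:
    """Return True if the line is a \\page or \\column command."""
    pattern = TOLERANT if tolerant else STRICT
    return pattern.match(line) is not None


print(is_command("\\page"))                   # True
print(is_command("\\page  "))                 # False: trailing spaces break the strict match
print(is_command("\\page  ", tolerant=True))  # True
```

If the cause is a match of this kind, allowing `[ \t]*` before the end of the line would likely be enough, and it would still reject lines such as `\pagebreak`.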
-
Refs https://github.com/scaife-viewer/backend/blob/35f792914d04152cecce7426a061a9824ae5c45c/core/scaife_viewer/core/indexer.py#L140
New URNs mean these will fail:
- https://scaife-dev.perseus.org…
-
Revisiting an old issue here: should `12 div-3` parse?
Under the new 4.0 tokenization rules, it certainly doesn't.
But under Michael Dyck's interpretation of the 3.1 rules, it does parse; and ac…
-
As we extend deduplication to a wide range of languages, the choice of tokenization method will affect the final results.
The current script uses a simple regex and uni-gram to perform min…
-
When running the code, the following error might be encountered:
```
File "HKU-DASC7606-A2\tokenization_codegen.py", line 203, in get_vocab
return dict(self.encoder, **self.added_tokens_encoder)
A…
-
### Checklist
- [ ] 1. I have searched related issues but cannot get the expected help.
- [ ] 2. The bug has not been fixed in the latest version.
### Describe the bug
(lmdeploy042) yuzailiang@ubun…
-
### Resource Type
_No response_
### Describe the problem or limitation you are having
4.x just added binary tokenization back:
https://github.com/godotengine/godot/pull/87634
### Describe the fea…
-
First, thanks for your excellent work. Here is my question:
- I used your code to reproduce the results in your paper, but found that CPU utilization was really high during the training process, espe…