-
JSON-LD models and metamodels currently don't include metadata such as version or concept descriptions. This information should also be included, so that documentation can be generated automatically.
Metada…
-
Hey @Tyr30 thanks for a good discussion earlier today! As mentioned, some minor thoughts on things to add:
- [ ] (Event Core) verbatimLocality -- this is where I would record the site_name (e.g., N…
-
**why subtract self.vocab_start_index**
```
def forward(self, input_):
    assert not torch.any(
        (input_ < 0) | (input_ >= self.num_embeddings)
    ), "An input token is o…
```
-
### 🚀 The feature
Support for multiple languages (corresponding to VOCABS["multilingual"]) in pretrained models.
### Motivation, pitch
It would be great to use models that support multiple languages b…
-
This is one half of the original issue #10.
Both Webwarp and WebAlign have been registered and published on the Knowledge Graph:
Webwarp (0.7): https://search.kg.ebrains.eu/instances/06358a83-5bf0…
-
Thanks for providing this toolset. I am unsure how utils.extend_model_vocab is intended to be used. In its current form, it takes checkpoints only. When I try to adapt it to extend a model in safetensor f…
-
Message='BaiChuanTokenizer' object has no attribute 'sp_model'
Source=C:\Users\Administrator\.cache\huggingface\modules\transformers_modules\Sunsimiao\tokenization_baichuan.py
StackTrace:
…
-
no data about vocab.txt
-
we should improve delete so that it only removes the vocabulary when it's completely removed
1. delete of a vocab should create a task for the deletion
2. the status of the vocab should be changed…
nvdk updated
4 months ago
-
New repo. Just scrape the words from every dict and make a list for each kanji. Maybe add features to the is-hanzi module (or create a new module using it to parse characters).
Also maybe time to cr…