-
I don't know why this problem occurs. Thank you for reading.
OSError: Model name '../pretrained_models/RoBERTa-zh-Large/' was not found in tokenizers model name list (roberta-base, roberta-large, rob…
-
The complex accreditation and recognition rules for educational credentials and qualifications, which vary across member states, have necessitated the development of equally sophisticated data models,…
-
## Problem
The current documentation page of the vocabulary design system lacks a well-structured table of contents to enable users to jump to sections they need directly without having to scroll thr…
-
The [OData core vocabulary](http://docs.oasis-open.org/odata/odata/v4.0/os/vocabularies/Org.OData.Core.V1.xml) uses Org.OData as the root of the vocabulary namespace. We should determine a root namesp…
-
Hello, thank you for providing the community with a great framework. I did some experiments with the T5-small model on the C4 Vietnamese dataset, and I would like to get some feedback from you:
1. …
-
I got the error message below when running `sh 15_pretrain_full.sh`.
> OSError: Model name './E2E-MABSA' was not found in tokenizers model name list (facebook/bart-base, facebook/bart-large, facebo…
-
There was a request to provide autosuggestion of BIC subject category codes in OMP 3.3 and to use them in the ONIX export.
The ONIX export [expects the BIC subject categories](https://github.com/pkp/o…
-
_This issue complements the discussion in [issue #1961](https://github.com/orgs/oasis-tcs/projects/5?pane=issue&itemId=69511727&issue=oasis-tcs%7Codata-specs%7C1961) with a proposal to extend service …
-
SentencePiece does not split unknown tokens by default, which results in a larger vocabulary in the tokenized corpora. Also, the Tokenizer does not seem to work very well with the spacer, more segmentation made by …
-
The dataset types in the gEAR portal are nice and tidy:
```
mysql> select dtype, count(dtype) from dataset group by dtype;
+-----------------------+--------------+
| dtype | coun…