-
Dear developers,
I did not find an option to limit the vocabulary. For example, I don't want to learn representations for words that occur fewer than 50 times in my corpus.
The reason is that if I use all…
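The frequency cutoff the request describes can also be done as a pre-filtering step outside the library. The sketch below is illustrative only: it assumes a tokenized corpus (a list of token lists) rather than any particular library's API (in gensim's `Word2Vec`, for example, the equivalent knob is `min_count`).

```python
from collections import Counter

def build_vocab(corpus, min_count=50):
    """Keep only tokens that occur at least `min_count` times in the corpus."""
    counts = Counter(tok for sentence in corpus for tok in sentence)
    return {tok for tok, c in counts.items() if c >= min_count}

# Tiny illustration with min_count=2: "banana" occurs once and is dropped.
corpus = [["apple", "banana"], ["apple"]]
print(build_vocab(corpus, min_count=2))  # {'apple'}
```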
-
The current list of Resource Types https://github.com/Sinar/sinar.resource/blob/main/src/sinar/resource/vocabularies/resource_type.py is very general, missing some commonly used types and often not sp…
-
## 🐛 Bug
The special large-batch / large-alphabet handling, although it can sometimes provide up to a 2.5x speedup, comes at the cost of up to 1.8x more memory. For large targets, this can be a significant…
ASDen updated 5 years ago
-
There are huge amounts of data being generated at hospitals every day. Up to 80% of this data is collected in an unstructured format and a large portion of it as free text. In order to extract value f…
-
This is partly a bug and partly a feature; it was discovered when I ran the tool on a subset of Gates publications from DCP, in particular only 4 publications.
The way the fuzzy matcher wor…
-
**System information**
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): mac m1
- TensorFlow version and how it was installed (source or binary): binary
- TensorFlow-Recommenders-Addons ve…
-
When applying pretrained models on real datasets, we often need to adapt the tokenizer and ensure that we can appropriately transfer the knowledge:
- Case 1: Trim the vocabulary
For example, …
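One way to sketch "Case 1: Trim the vocabulary" above: keep only the pretrained tokens that actually appear in the target corpus, re-index them, and slice the embedding matrix to match. The names `trim_vocab`, `vocab`, and `embeddings` are hypothetical, not taken from any specific tokenizer API.

```python
import numpy as np

def trim_vocab(vocab, embeddings, corpus_tokens):
    """Drop vocabulary entries unseen in the target corpus and slice
    the pretrained embedding matrix to the new, smaller vocabulary."""
    kept = [tok for tok in vocab if tok in corpus_tokens]
    new_vocab = {tok: i for i, tok in enumerate(kept)}
    new_embeddings = embeddings[[vocab[tok] for tok in kept]]
    return new_vocab, new_embeddings

vocab = {"cat": 0, "dog": 1, "axolotl": 2}
embeddings = np.arange(6, dtype=float).reshape(3, 2)  # one 2-d row per token
new_vocab, new_emb = trim_vocab(vocab, embeddings, {"cat", "axolotl"})
print(new_vocab)  # {'cat': 0, 'axolotl': 1}
```

In practice one usually also keeps special tokens ([UNK], [PAD], etc.) regardless of corpus frequency.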
-
The current text, while constructive and helpful, is slightly too long on the various pages, especially when first entering Emnesøk. Also, not all concepts may be clear to a user unfamiliar with subject …
-
I am working on using NanoGPT to solve a geometry problem. I would like to use the gpt2 network structure but my own tokenizer. My vocabulary size is 1500. I have my own encode/decode code to convert …
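A minimal sketch of such a custom tokenizer, assuming a fixed inventory of 1500 symbols (the symbol names below are placeholders for the actual geometry vocabulary); nanoGPT's `GPTConfig` then needs its `vocab_size` field set to match.

```python
# Hypothetical symbol inventory; replace with the real geometry vocabulary.
symbols = [f"sym{i}" for i in range(1500)]
stoi = {s: i for i, s in enumerate(symbols)}
itos = {i: s for s, i in stoi.items()}

def encode(tokens):
    """Map a list of symbols to integer ids for the model."""
    return [stoi[t] for t in tokens]

def decode(ids):
    """Map integer ids back to symbols."""
    return [itos[i] for i in ids]

print(encode(["sym3", "sym7"]))  # [3, 7]
# The model must be built with a matching vocabulary size, e.g.
# GPTConfig(vocab_size=1500, ...) instead of GPT-2's 50257.
```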
-
What: WordPiece is an unsupervised multilingual text tokenizer. It is used in models such as BERT, though it can in principle be used in many NLP models. It produces a user-specified fixed-size vocabul…
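At inference time, WordPiece typically tokenizes each word with a greedy longest-match-first scan, marking word-internal pieces with a `##` prefix. The sketch below shows that scan on a toy vocabulary; real implementations add details such as a maximum characters-per-word limit.

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first WordPiece tokenization of a single word."""
    tokens, start = [], 0
    while start < len(word):
        end = len(word)
        cur = None
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece  # mark word-internal pieces
            if piece in vocab:
                cur = piece
                break
            end -= 1  # shrink the candidate piece from the right
        if cur is None:
            return [unk]  # no prefix of the remainder is in the vocabulary
        tokens.append(cur)
        start = end
    return tokens

vocab = {"un", "##aff", "##able"}
print(wordpiece_tokenize("unaffable", vocab))  # ['un', '##aff', '##able']
```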