tokenization Search Results

1000+ results
for tokenization

acdh-oeaw/shawi-data #23

split utterances with several speaker

In several texts (e.g. Urfa-107_Cotton_Business) one ELAN segment contains utterances of several speakers. It would be good to separate those: * manually split ELAN segments * replace speaker init…

dasch124 updated 9 months ago
1
utterworks/fast-bert #158

AlgorithmError: Exception during training: expected string o…

@kaushaltrivedi **While training on SageMaker I'm facing this issue** - INFO - root - Writing example 0 of 6382 Exception during training: expected string or bytes-like object Traceback (m…

punit121 updated 4 years ago
1
mlfoundations/open_clip #403

Improve tokenizer decode

Right now the tokenizer decode method supports only a single instance at a time. I think it would be good to have `batch_decode` function and also support `skip_special_tokens` and `clean_up_tokenizat…

vturrisi updated 1 year ago
2
niuzaisheng/ScreenAgent #16

google.protobuf.message.DecodeError: Error parsing message

RANK=0 WORLD_SIZE=1 LOCAL_RANK=0 python cogagent_model_worker.py --host 0.0.0.0 --port 40000 --from_pretrained "saved_models/ScreenAgent-2312" --bf16 --max_length 2048 [2024-04-10 13:38:43,071] [INF…

gdnyfcuso updated 7 months ago
1
ash-jyc/db84llm #2

transcript post-processing

1. need to format debates into a better format than one block of text 2. need model / LLM to learn debate vocabulary and fix transcripts 3. delete repeats 4. make LLM "flow" -- store each argument in …

ash-jyc updated 4 months ago
1
Expensify/App #36957

[$1000] Create new library and native module react-native-wa…

To support tokenization in NewDot which is adding a virtual card to Apple / Google Pay, we will need access to some native methods. Problem: Apple Pay and Google Pay are table stakes in the card …

thienlnam updated 6 days ago
114
ComplianceAsCode/redhat-identity-management #72

IA-5(h) Protecting authenticator content from unauthorized d…

http://ssptool.securitycentral.io/certifications/FedRAMP-low/NIST-800-53/IA-5 "h. Protecting authenticator content from unauthorized disclosure and modification;" If usernames stored in database…

shawndwells updated 6 years ago
1
ftyers/docs #23

[ud] Release v1.3: pos mapping: abbr => ?

**pos:** depends... **Tokenization:** _always joint?_, i.e. keep punctuation within the token. **Dependency:** depends... currently used in rev. freq. order: ``` 28 nmod 9 name …

makazhan updated 8 years ago
3
we-like-parsers/cpython #157

f-string parser: Check comments in f-strings

In `test_fstring`, the test `test_comments` fails. We need to check whether this is due to regular tokenization problems, in which case the test needs updating, or whether we need to fix something regarding comments in f-strings.

pablogsal updated 1 year ago
9
dmlc/gluon-nlp #497

[BERT] BERT Tokenizer PR followups

- [ ] Add a de-tokenizer API which returns the original string representation given tokens - [x] Replace existing tokenization code in `script/bert` with the new tokenizer API

eric-haibin-lin updated 5 years ago
2
