tokenization Search Results

1000+ results
for tokenization

irlab-sdu/fuzi.mingcha #11

AttributeError: 'ChatGLMTokenizer' object has no attribute '…

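When I run cli_demo.py, it fails with an attribute-not-found error: (base) root@hzhb:/data/fuzi.mingcha-main/src# python3 cli_demo.py --url_lucene_task1 "address of the pylucene deployment for statute retrieval" --url_lucene_task2 "address of the pylucene deployment for similar-case retrieval" Loading model Traceback (m…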

OneBigMonster updated 3 months ago
3
nschneid/activedop #83

in input sentence, punctuation immediately after gap gets lo…

e.g. `word _. .` gets translated to CGEL without the final period

nschneid updated 2 weeks ago
3
projectEndings/staticSearch #319

We should not tokenize on connector punctuation

The Unicode category of Connector Punctuation (https://www.unicode.org/charts/script/chart_Punctuation-Connector.html), which is a small collection of punctuation-like symbols which are used as connec…

martindholmes updated 2 weeks ago
3
dvlab-research/MGM #83

Finetune

Hi, I fine-tuned MGM-2B on COCO, but I got this warning: `{'loss': 6.9221, 'grad_norm': tensor(18.7422, device='cuda:0', dtype=torch.float64), 'learning_rate': 9.203084832904885e-06, 'epoch': 0.01}…

ZhangScream updated 6 months ago
7
Zeyi-Lin/LLM-Finetune #4

CUDA driver version too low

File "/ft/train.py", line 168: `response = predict(messages, model, tokenizer)` File "/ft/train.py", line 70: `model_inputs = tokenizer([text], return_tensors="pt").to(device)` File "/root/miniconda3/lib/python3.8…

hellocrystal updated 1 month ago
1
oss-slu/Enhancing-Bioinformatics-Research-through-LLM #3

Develop a simple data preprocessing pipeline for a specific …

Create a basic data preprocessing pipeline for a specific bioinformatics dataset to prepare it for LLM training. The pipeline should include steps for data cleaning, tokenization, and formatting

AjithAkuthota23 updated 1 month ago
1
sourcegraph/sourcegraph-public-snapshot #61729

Investigate Tokenization for Claude 3 models

- Investigate whether Claude 3 models need a new tokenization method or whether the old methods can be used for abuse detection - Collect data from the experiments and share the results to inform the decision.

arafatkatze updated 6 months ago
2
UniversalDependencies/UD_Portuguese-Bosque #46

PARTicules and tokenization

In #38 @livyreal said that some PART are not correctly tokenized/lemmatized. Let us try a different approach... The following pages define the PART POS tag (in general and for English). - http://un…

arademaker updated 2 years ago
14
pranavilingamallu/clearnlp #2

Potentially incorrect tokenization

``` When writing a tokenization unit test for the ClearTK wrappers for ClearNLP, I found an inconsistency between OpenNLP's tokenization and ClearNLP's. Consider the string: String s = "\"John & Mar…

GoogleCodeExporter updated 9 years ago
3
peter6888/nlp_project #6

Tokenization and pointers

- [x] Implement token generation - [ ] Implement pointer mechanism - [ ] Error analysis - [ ] Reinforcement learning implementation

apurvapancholi updated 6 years ago
5
