-
### System Info
macOS, Chrome, with WebGPU support
### Environment/Platform
- [X] Website/web-app
- [ ] Browser extension
- [ ] Server-side (e.g., Node.js, Deno, Bun)
- [ ] Desktop app (e.g., Electron)
- […
-
Hi, could you please share the code for the GTE retrieval (recall) approach?
-
Not sure if this is intentional or not:
```
const natural = require('natural'); // the `natural` npm package provides WordPunctTokenizer
var tokenizer = new natural.WordPunctTokenizer();
console.log(tokenizer.tokenize("Example sentence (with parenthetical expression)."));
```
outputs:
```
…
```
-
### Issue type
Bug
### Have you reproduced the bug with TensorFlow Nightly?
Yes
### Source
source
### TensorFlow version
tensorflow==2.15.0.post1
### Custom code
Yes
### OS platform and dist…
-
## Weird sentence splitting
I am currently using this summarizer for German text, but I keep running into sentences being split at abbreviations. To give an example, I have the sent…
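
For illustration only, here is a minimal sketch of one common workaround for this kind of problem: split on sentence-ending punctuation, but skip the split when the token that ends there is a known abbreviation. This is not the summarizer's own splitter. The function name `split_sentences` and the short `GERMAN_ABBREVIATIONS` set are hypothetical and would need extending for real text.

```python
import re

# Hypothetical, deliberately short abbreviation list (an assumption for this sketch).
GERMAN_ABBREVIATIONS = {"z.B.", "bzw.", "ca.", "Dr.", "usw.", "Nr.", "d.h."}


def split_sentences(text, abbreviations=GERMAN_ABBREVIATIONS):
    """Split after '.', '!' or '?' followed by whitespace,
    unless the token ending at that point is a known abbreviation."""
    sentences = []
    start = 0
    for match in re.finditer(r"[.!?]\s+", text):
        candidate = text[start:match.end()].strip()
        last_token = candidate.split()[-1]
        if last_token in abbreviations:
            # Not a real sentence boundary; keep scanning.
            continue
        sentences.append(candidate)
        start = match.end()
    # Whatever remains after the last boundary is the final sentence.
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


example = "Das kostet ca. 5 Euro. Dr. Müller kommt z.B. morgen! Gut?"
print(split_sentences(example))
```

One known limitation of this approach: an abbreviation that really does end a sentence (for example a trailing `usw.`) will not trigger a split, so the list and the rule are a trade-off rather than a complete fix.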
-
## 🐛 Bug
When permuting sentences in the [denoising dataset](https://github.com/facebookresearch/fairseq/blob/a6a63279422f846a3c2f6c45b9c96d6951cc4b82/fairseq/data/denoising_dataset.py#L225), the p…
-
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
in
93 TOKENIZER, MODEL = load_…
-
It's becoming the norm to have prompt prefixes for text embedding models. I think we should add this to the [hf-embedder](https://docs.vespa.ai/en/reference/embedding-reference.html#huggingface-embedd…
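
To make the request concrete, here is a rough sketch of what a prompt prefix does before text reaches the model. It is plain Python, not the hf-embedder's configuration or API. The helper name `add_prefix` is hypothetical, and the `query: ` / `passage: ` strings follow the convention used by E5-style models; other models expect different prefixes.

```python
# Assumed prefix strings (E5-style convention); other models use different ones.
QUERY_PREFIX = "query: "
DOCUMENT_PREFIX = "passage: "


def add_prefix(texts, prefix):
    """Return a new list with the prefix prepended to every input text."""
    return [f"{prefix}{text}" for text in texts]


queries = add_prefix(["how do prompt prefixes work"], QUERY_PREFIX)
documents = add_prefix(["Prefixes tell the model which role a text plays."], DOCUMENT_PREFIX)

print(queries)
print(documents)
```

The point of keeping the prefix configurable per role is that queries and documents usually need different strings, so a single global prefix setting would likely not be enough.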
-
Hey, super useful tool!
There's been some development in the chunking community. If you'd like to keep your app up to date, here are a few suggestions. Also, considering that all of the options str…
do-me updated
1 month ago
-
`word_tokenize` keeps opening single quotes and doesn't pad them with a space; this makes sure that clitics get tokenized as `'ll`, `'ve`, etc.
The original treebank tokenizer has the sam…