-
Zhihu Daily Picks 2023-01-13
-
Zhihu Daily Picks 2023-01-12
-
Zhihu Daily Picks 2023-01-11
-
I want to train a Chinese model. Do you support mixed Chinese and English input?
-
Assuming I have downloaded the `box` folder with the `data`, `tokenizers`, and `trained_model` folders, could you please provide an example of how to run evaluation on `box/data/qed/chembl_selfies_eva…
-
Is there any way to compute surprisal for Chinese sentences? Right now, the Chinese characters are processed in a weird way and the output does not match the number of Chinese characters in the input.…
-
Was googling `"pytorch" "adaptive_max_pool2d"`. Found this:
```
torch.nn — PyTorch master documentation
pytorch.org/docs/master/nn.html
... max_pool2d; max_pool3d; max_unpool1d; max_unpool2d; max_…
```
-
OSError: Can't load tokenizer for 'bert-base-chinese'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, mak…
-
### System Info
os: mac-arm
node: v18.18.2
electron: "28.0.0",
electron-vite: "^1.0.27",
### Environment/Platform
- [ ] Website/web-app
- [ ] Browser extension
- [X] Server-side (e.g., N…
-
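1. The file I prepared is shown in the figure. I plan to split it on \n\n; every paragraph is separated from the next by a double blank line.
2. I'm using the ali_text_splitter splitter, which as far as I can see can split directly on double blank lines. The code is as follows:
def split_text(self, text: str) -> List[str]:
# The use_document_segmentation parameter specifies whether to split the document semantically; the semantic document segmentation mo…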
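
For comparison, here is a minimal sketch of the plain `\n\n` split described in step 1, using only the Python standard library. It does not reproduce ali_text_splitter or its `use_document_segmentation` behavior; the function name `split_on_blank_lines` and the sample text are hypothetical, chosen only for illustration.

```python
import re
from typing import List


def split_on_blank_lines(text: str) -> List[str]:
    """Split text into paragraphs wherever two or more newlines occur in a row.

    Hypothetical helper for illustration; not part of ali_text_splitter.
    """
    # Normalize Windows line endings so "\r\n\r\n" also counts as a paragraph break.
    normalized = text.replace("\r\n", "\n")
    # Split on runs of two or more newlines.
    chunks = re.split(r"\n{2,}", normalized)
    # Trim surrounding whitespace and drop empty or whitespace-only pieces.
    return [chunk.strip() for chunk in chunks if chunk.strip()]


# Made-up sample: single newlines stay inside a paragraph, blank lines separate paragraphs.
sample = "First paragraph.\nStill first.\n\nSecond paragraph.\n\n\nThird paragraph."
paragraphs = split_on_blank_lines(sample)
for paragraph in paragraphs:
    print(repr(paragraph))
```

If a splitter returns different chunks than this on the same file, checking the file for `\r\n` line endings or for whitespace on the "blank" lines is probably a reasonable first step, since either would stop a literal `\n\n` match.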