shibing624 text2vec issues

shibing624 / text2vec

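text2vec, text to vector. A text vector representation tool that converts text into vector matrices. It implements text representation and text similarity models including Word2Vec, RankBM25, Sentence-BERT, and CoSENT, and works out of the box.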

https://pypi.org/project/text2vec/

Apache License 2.0

4.39k stars, 392 forks

issues

Symmetric or asymmetric semantics?

#103 yufengzhe1 closed 1 year ago
2
Error in "Minkowski distance"

#102 marlo-Li closed 1 year ago
1
Is slow CPU inference with text2vec-base-chinese-sentence normal?

#101 Dengyingjie closed 1 year ago
2
Add Bfloat16 and multi-GPU DataParallel training support

#100 wptoux closed 1 year ago
1
Hello! I'd like to ask about dataset construction

#99 programmeguru closed 1 year ago
1
File splitting: text2vec-base-chinese supports a maximum input of 128 tokens

#98 stoneLee81 closed 1 year ago
2
Ratio of positive to negative samples in the training data

#97 baisuzi closed 1 year ago
1
What is the maximum number of input tokens the text2vec-base-chinese model supports?

#96 stoneLee81 closed 1 year ago
2
Is there a model for keyword extraction?

#95 Godlikemandyy closed 1 year ago
1
I run eval_model once every 100 batches and get results that differ from the original

#94 programmeguru closed 1 year ago
5
AttributeError: 'NoneType' object has no attribute 'squeeze'

#93 sz2three closed 1 year ago
5
Question about training sample

#92 callanwu closed 1 year ago
4
How is the unsupervised training data constructed? examples/training_unsup_text_matching_model_en.py

#91 sz2three closed 1 year ago
1
The 'Torch not compiled with CUDA enabled' error also occurs in CPU mode on M-series chips

#90 stoneLee81 closed 1 year ago
2
Has it been compared with OpenAI ada embeddings?

#89 CopyNinja1999 closed 1 year ago
1
Why can't the training set be shuffled? Performance drops a lot after shuffling

#88 programmeguru closed 1 year ago
3
Why aren't the embeddings normalized?

#87 thsno02 closed 1 year ago
2
Similarity calculation issue with the text2vec-base-chinese-paraphrase model

#86 zaczou closed 1 year ago
1
Does w2v-light-tencent-chinese have a text length limit?

#85 graciechen closed 1 year ago
1
Can a news classification dataset be used for fine-tuning?

#84 TanXiang7o closed 11 months ago
4
Why can't I find the SOTA metrics? Nothing in the release model table is higher than the metrics in the Chinese table

#83 TanXiang7o closed 1 year ago
1
semantic_search results change

#82 xxllp closed 1 year ago
0
Does text2vec only use pretrained models to represent text as vectors?

#81 1264561652 closed 1 year ago
3
Is Japanese supported?

#80 sz2three closed 1 year ago
2
Would it be possible to contrastively train an s2p or p2p embedding model from a good 7B Chinese pretrained model?

#79 firezym closed 1 year ago
2
In real-world scenarios, text2vec-base-chinese still performs best

#78 danger-dream closed 1 year ago
6
How can I run inference on GPU? I tried .cuda but found that SentenceModel has no cuda attribute

#77 wlhonce closed 1 year ago
1
Cannot reliably use CUDA to accelerate inference

#76 livehl closed 1 year ago
4
How should I use faiss to store the time-consuming embeddings computed here? Could you give an example?

#75 xx-zhang closed 1 year ago
0
Should I apply word segmentation when embedding Chinese?

#74 abusizhishen closed 10 months ago
1
How do I export the complete model?

#73 MingFL closed 1 year ago
5
Question about cos values

#72 hanswang73 closed 1 year ago
8
Running the API server with python jina_server_demo.py raises ModuleNotFoundError

#71 alexhmyang closed 11 months ago
2
Can labels support multiple levels? [Feature] <title>

#70 MingFL closed 1 year ago
2
About the evaluation results

#69 1264561652 opened 1 year ago
5
Where can the models in the Chinese matching dataset evaluation results be downloaded?

#68 DSXiangLi closed 1 year ago
2
About model evaluation

#67 AceCoder0 opened 1 year ago
2
Is commercial use allowed?

#66 bh4ffu closed 1 year ago
1
SBERT reaches a Pearson coefficient of only 0.63 after 10 epochs on the English STS-B dataset

#65 LemonMi closed 1 year ago
4
Using your shibing624/text2vec-base-chinese model, the output word embeddings are 768-dimensional. Can the dimension be reduced, e.g. to 128?

#64 JonGates closed 1 year ago
1
Which sentence embeddings and document embeddings are currently SOTA?

#63 yuanjie-ai closed 1 year ago
3
The server cannot download the model files. Can I upload those model files manually, and where should they go?

#62 zhaoyiCC closed 1 year ago
2
Can a more accurate model be trained on more samples?

#61 alexw994 closed 1 year ago
10
Calling m = CosentModel("bert-base-chinese") always gets killed

#60 zhaoyiCC closed 1 year ago
4
Question about building a custom dataset

#59 Gladiator566 closed 1 year ago
5
Related papers or benchmarks

#58 1264561652 closed 1 year ago
2
What if the sentence is longer than the BERT default of 512?

#57 RedBlack888 closed 1 year ago
9
Is model acceleration supported?

#56 flydsc closed 1 year ago
4
A question about semantic_search speed

#55 xxllp closed 1 year ago
6
Similarity for long texts

#54 xxllp closed 1 year ago
5
