Open rdyuan opened 6 months ago
/assign @wxywb Can you help investigate this?
This code works in my environment. It may be related to some multiprocessing problems I need to delve into. You can try the following code.
from milvus_model.sparse.bm25.tokenizers import build_default_analyzer
from milvus_model.sparse import BM25EmbeddingFunction
analyzer = build_default_analyzer(language="zh")
corpus = [
    "人工智能于1956年作为一门学科成立。",
    "艾伦·图灵是第一个对人工智能进行实质性研究的人。",
    "图灵出生在伦敦的梅达维尔,在英格兰南部长大。",
]
# this line will remove multi-processing
bm25_ef = BM25EmbeddingFunction(analyzer, num_workers=1)
bm25_ef.fit(corpus)
docs = [
    "人工智能领域于1956年作为一门学术学科成立。",
    "艾伦·图灵是在人工智能领域进行重大研究的先驱。",
    "图灵出生在伦敦的梅达维尔,在英格兰南部地区长大。",
    "1956年,人工智能作为一个学术领域出现。",
    "图灵来自伦敦梅达维尔,在英格兰南部长大。",
]
docs_embeddings = bm25_ef.encode_documents(docs)
print("Embeddings:", docs_embeddings)
print("Sparse dim:", bm25_ef.dim, list(docs_embeddings)[0].shape)
@rdyuan Could you give me the full trace log? It seems only part of it was posted.
Adding num_workers=1 did make it run successfully.
Is this problem still not solved? As soon as execution reaches fit, it goes into an infinite loop. With num_workers=1 it works.
What operating system are you using? Please also show me the code snippet and the error info.
It is the same problem as in this issue, and the OS is macOS on an Intel chip.
What is your Python version?
3.12
This is my full code:
from milvus_model.sparse.bm25.tokenizers import build_default_analyzer
from milvus_model.sparse import BM25EmbeddingFunction

analyzer = build_default_analyzer(language="zh")
corpus = [
    "人工智能于1956年作为一门学科成立。",
    "艾伦·图灵是第一个对人工智能进行实质性研究的人。",
    "图灵出生在伦敦的梅达维尔,在英格兰南部长大。",
]
bm25_ef = BM25EmbeddingFunction(analyzer)
bm25_ef.fit(corpus)
docs = [
    "人工智能领域于1956年作为一门学术学科成立。",
    "艾伦·图灵是在人工智能领域进行重大研究的先驱。",
    "图灵出生在伦敦的梅达维尔,在英格兰南部地区长大。",
    "1956年,人工智能作为一个学术领域出现。",
    "图灵来自伦敦梅达维尔,在英格兰南部长大。",
]
docs_embeddings = bm25_ef.encode_documents(docs)
print("Embeddings:", docs_embeddings)
print("Sparse dim:", bm25_ef.dim, list(docs_embeddings)[0].shape)
When execution reaches bm25_ef.fit(corpus), the following error occurs:
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/spawn.py", line 129, in _main
    prepare(preparation_data)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/spawn.py", line 240, in prepare
    main_content = runpy.run_path(main_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
Relevant versions: Python==3.11.3, milvus_model==0.2.2
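A possible explanation, offered as a guess rather than a confirmed diagnosis: the traceback shows spawn.py re-running the main script through runpy.run_path. That is how the "spawn" start method, the default on macOS, prepares each child process. If the script calls fit at module top level, each worker would re-execute that call while importing the script. The stdlib-only sketch below illustrates the usual guard. The names `tokenize` and `tokenize_all` are hypothetical stand-ins and not part of milvus_model; whether the library's workers behave this way is an assumption.

```python
import multiprocessing as mp
from concurrent.futures import ProcessPoolExecutor


def tokenize(text):
    # Hypothetical stand-in for the analyzer: plain whitespace splitting.
    return text.split()


def tokenize_all(corpus, num_workers=1):
    # num_workers=1 stays in the current process, so no child is spawned.
    if num_workers <= 1:
        return [tokenize(doc) for doc in corpus]
    # "spawn" is the default start method on macOS; each child re-imports
    # the main module before it can run tokenize().
    ctx = mp.get_context("spawn")
    with ProcessPoolExecutor(max_workers=num_workers, mp_context=ctx) as pool:
        return list(pool.map(tokenize, corpus))


if __name__ == "__main__":
    # Without this guard, every spawned child would re-run the lines below
    # while importing the script.
    corpus = ["alan turing was born in london", "ai was founded in 1956"]
    print(tokenize_all(corpus, num_workers=2))
```

If the same reasoning applies here, wrapping the script body (from `analyzer = ...` through the final `print`) in `if __name__ == "__main__":` may let the default worker count run without setting num_workers=1.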