Open atomiechen opened 5 months ago
Delete all *model_revision arguments and try again. All requirements will then be installed automatically.
Yes, thank you. But basically what I want to do is to build an image with installed packages ahead of running any scripts. I believe I should not figure it out through trial and error by myself.
If any errors remain after you delete all *model_revision, please let me know.
Sadly yes.
I removed all *model_revision:
from funasr import AutoModel

model = AutoModel(
    model="iic/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch",
    vad_model="iic/speech_fsmn_vad_zh-cn-16k-common-pytorch",
    punc_model="iic/punc_ct-transformer_zh-cn-common-vocab272727-pytorch",
    spk_model="iic/speech_campplus_sv_zh-cn_16k-common",
)
And I still got:
ckpt: iic/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch/model.pt
ckpt: iic/speech_fsmn_vad_zh-cn-16k-common-pytorch/model.pt
ckpt: iic/punc_ct-transformer_zh-cn-common-vocab272727-pytorch/model.pt
ckpt: iic/speech_campplus_sv_zh-cn_16k-common/campplus_cn_common.bin
Traceback (most recent call last):
  File "/shared/test-funasr/tmp_test.py", line 10, in <module>
    spk_model="iic/speech_campplus_sv_zh-cn_16k-common",
  File "/home/user/.local/lib/python3.10/site-packages/funasr/auto/auto_model.py", line 135, in __init__
    self.cb_model = ClusterBackend().to(kwargs["device"])
  File "/home/user/.local/lib/python3.10/site-packages/funasr/models/campplus/cluster_backend.py", line 149, in __init__
    self.umap_hdbscan_cluster = UmapHdbscan()
  File "/home/user/.local/lib/python3.10/site-packages/funasr/models/campplus/cluster_backend.py", line 118, in __init__
    import hdbscan
ModuleNotFoundError: No module named 'hdbscan'
FunASR Version: 1.0.19
And I cannot even import funasr using the latest commit (702b9b540c3c1524748cd975a10ce33f0fa53912) on main branch:
>>> import funasr
/.../FunASR/funasr/datasets/large_datasets/utils/tokenize.py:93: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if vad is not -2:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/.../FunASR/funasr/__init__.py", line 36, in <module>
    import_submodules(__name__)
  File "/.../FunASR/funasr/__init__.py", line 33, in import_submodules
    results.update(import_submodules(name))
  File "/.../FunASR/funasr/__init__.py", line 33, in import_submodules
    results.update(import_submodules(name))
  File "/.../FunASR/funasr/__init__.py", line 33, in import_submodules
    results.update(import_submodules(name))
  File "/.../FunASR/funasr/__init__.py", line 25, in import_submodules
    for loader, name, is_pkg in pkgutil.walk_packages(package.__path__, package.__name__ + '.'):
AttributeError: 'str' object has no attribute '__path__'. Did you mean: '__hash__'?
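For readers hitting the same AttributeError: it is what Python raises when a package name (a plain str) reaches code that expects an imported module object, because only module objects carry `__path__`. Below is a minimal sketch of a recursive importer that accepts either form. The function body is an illustration written for this thread, not FunASR's actual code, and the real fix may look different.

```python
import importlib
import pkgutil


def import_submodules(package):
    """Import every submodule of `package` and return them keyed by dotted name.

    `package` may be a module object or a dotted name; a name is resolved to a
    module first, so that `__path__` is always available below.
    """
    if isinstance(package, str):
        package = importlib.import_module(package)

    results = {}
    prefix = package.__name__ + "."
    # iter_modules lists only direct children; recursion handles the rest.
    for _finder, name, is_pkg in pkgutil.iter_modules(package.__path__, prefix):
        results[name] = importlib.import_module(name)
        if is_pkg:
            # Passing the name is safe here because of the str check above.
            results.update(import_submodules(name))
    return results
```

Calling `import_submodules("json")` on the standard library's `json` package, for example, returns a dict whose keys include `json.decoder`.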
Plus: all my models are already there inside the literally iic folder in current directory, so there is no extra downloads. The environment running above script does not have modelscope installed.
Still worth mentioning: during the image-building phase, one should not use a test script like this to 'trigger' the automatic installation of extra dependencies; that is an anti-pattern. The environment should be prepared with explicit commands, like pip install funasr[spk].
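Independently of how the packages get installed, an image build can fail fast when an optional dependency is absent, without loading any model. The sketch below only checks whether modules can be found on the import path. The module list is an assumption: `hdbscan` is confirmed by the traceback in this thread, while `umap` is a guess based on the `UmapHdbscan` class name.

```python
import importlib.util

# Assumed import names of the optional speaker-model dependencies.
# "hdbscan" appears in the traceback; "umap" is a guess, adjust as needed.
OPTIONAL_MODULES = ["hdbscan", "umap"]


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def check_or_exit(names=OPTIONAL_MODULES):
    """Abort with a non-zero exit status if any listed module is missing."""
    missing = missing_modules(names)
    if missing:
        raise SystemExit("missing modules: " + ", ".join(missing))
```

Running `check_or_exit()` as a build step makes a missing package stop the build instead of surfacing later at model load time.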
You should pip install -e .
I mean I tried both ways:
1. pip install funasr to install the latest PyPI version (1.0.19)
2. pip install -e . after pulling the latest commit of the main branch, which results in the above error.
First run pip install -e ., then uncomment the line here and post the error log: https://github.com/alibaba-damo-academy/FunASR/blob/main/funasr/__init__.py#L21
The bug has been fixed in https://github.com/alibaba-damo-academy/FunASR/pull/1580. Please update funasr:
git pull
pip install -e .
I pulled the latest commit, ran pip install -e ., and uncommented the print (see screenshot), but still got the same output. So there is no error reported here.
Requirements are installed at https://github.com/alibaba-damo-academy/FunASR/blob/main/funasr/download/download_from_hub.py#L76
Maybe you could debug it and show the log.
The problem is that models of a previous revision (instead of master) are already downloaded in the iic folder, and the code does not check for that and will not redownload the latest master revision. So there is no requirements.txt file in the campplus model folder.
I now understand that the requirements.txt comes from the model dir. Maybe some mechanism of auto redownloading the specified revision is required?
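A quick way to spot the situation described here is to look for model folders that lack a requirements.txt. The helper below is a sketch under that single assumption; it cannot tell which revision a folder holds, and the `iic` root used in the usage note is just the local layout from this thread.

```python
from pathlib import Path


def dirs_missing_requirements(model_root):
    """List subfolders of `model_root` that contain no requirements.txt.

    This is only a heuristic for "possibly an older download"; a model may
    also legitimately ship without a requirements file.
    """
    root = Path(model_root)
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and not (entry / "requirements.txt").is_file()
    )
```

For example, `dirs_missing_requirements("iic")` would name the cached folders worth deleting and downloading again.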
❓ And also I wonder if this is possible:
2. There is a sklearn.cluster.HDBSCAN, and I find sklearn is already there with funasr installed. Can we just use that sklearn one instead of installing the standalone version hdbscan?
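To make the question concrete, here is a rough sketch of how a caller could prefer scikit-learn's implementation and fall back to the standalone package. `resolve_hdbscan` is a hypothetical helper, not part of FunASR. `sklearn.cluster.HDBSCAN` only exists in scikit-learn 1.3 and later, and the two implementations are not guaranteed to produce identical labels, so whether the clustering backend could switch is for the maintainers to judge.

```python
import importlib


def resolve_hdbscan():
    """Return (module_name, HDBSCAN class), or (None, None) if unavailable.

    scikit-learn is tried first; older scikit-learn releases have no HDBSCAN
    attribute, in which case the standalone `hdbscan` package is tried.
    """
    for module_name in ("sklearn.cluster", "hdbscan"):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue
        cls = getattr(module, "HDBSCAN", None)
        if cls is not None:
            return module_name, cls
    return None, None
```

Both classes accept `min_cluster_size` and provide `fit_predict`, so code limited to those two features could use whichever class is returned.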
❓ Questions and Help
What is your question?
Starting from a fresh container environment equipped with pytorch and funasr (via pip install funasr), I encountered ModuleNotFoundError: No module named 'hdbscan' when I instantiate an AutoModel with a spk model. It originates from the import hdbscan in UmapHdbscan() <- ClusterBackend() <- AutoModel(...).
1. Must I install hdbscan manually? Is there any other package that I also need in advance?
2. There is a sklearn.cluster.HDBSCAN, and I find sklearn is already there with funasr installed. Can we just use that sklearn one instead of installing the standalone version hdbscan?
Code
What have you tried?
In a pytorch docker container, run pip install funasr and then the script above.
What's your environment?
pip
, source): pip