dataiku / dataiku-contrib

Public repository for DSS plugins
Apache License 2.0
101 stars, 82 forks

[Sentence Embeddings] Error when running #178

Closed ghost closed 4 years ago

ghost commented 4 years ago

Hello,

Can you please help me with the below error code:

"Job failed: Error in Python process: At line 5: <type 'exceptions.ImportError'>: cannot import name open"

[2020/01/07-11:43:28.764] [null-err-72] [INFO] [dku.utils] - *** Recipe code failed **
[2020/01/07-11:43:28.765] [null-err-72] [INFO] [dku.utils] - Begin Python stack
[2020/01/07-11:43:28.766] [null-err-72] [INFO] [dku.utils] - Traceback (most recent call last):
[2020/01/07-11:43:28.766] [null-err-72] [INFO] [dku.utils] - File "/Users/bob/Library/DataScienceStudio/dss_home/jobs/TWITTER/Build_glove1_2020-01-07T09-43-27.559/compute_glove1_NP/custom-python-recipe/pyout4PDNSy09fJPU/python-exec-wrapper.py", line 194, in
[2020/01/07-11:43:28.767] [null-err-72] [INFO] [dku.utils] - exec(f.read())
[2020/01/07-11:43:28.767] [null-err-72] [INFO] [dku.utils] - File "", line 5, in
[2020/01/07-11:43:28.768] [null-err-72] [INFO] [dku.utils] - File "/Users/bob/Library/DataScienceStudio/dss_home/plugins/installed/sentence-embedding/python-lib/commons.py", line 4, in
[2020/01/07-11:43:28.768] [null-err-72] [INFO] [dku.utils] - from dku_language_model.context_independent_language_model import FasttextModel, Word2vecModel, GloveModel, CustomModel
[2020/01/07-11:43:28.769] [null-err-72] [INFO] [dku.utils] - File "/Users/bob/Library/DataScienceStudio/dss_home/plugins/installed/sentence-embedding/python-lib/dku_language_model/init.py", line 1, in
[2020/01/07-11:43:28.769] [null-err-72] [INFO] [dku.utils] - from dku_language_model.context_independent_language_model import FasttextModel, Word2vecModel, GloveModel
[2020/01/07-11:43:28.770] [null-err-72] [INFO] [dku.utils] - File "/Users/bob/Library/DataScienceStudio/dss_home/plugins/installed/sentence-embedding/python-lib/dku_language_model/context_independent_language_model.py", line 4, in
[2020/01/07-11:43:28.770] [null-err-72] [INFO] [dku.utils] - from gensim.models import KeyedVectors
[2020/01/07-11:43:28.771] [null-err-72] [INFO] [dku.utils] - File "/Users/bob/Library/DataScienceStudio/dss_home/code-envs/python/plugin_sentence-embedding_managed/lib/python2.7/site-packages/gensim/init.py", line 5, in
[2020/01/07-11:43:28.771] [null-err-72] [INFO] [dku.utils] - from gensim import parsing, corpora, matutils, interfaces, models, similarities, summarization, utils # noqa:F401
[2020/01/07-11:43:28.772] [null-err-72] [INFO] [dku.utils] - File "/Users/bob/Library/DataScienceStudio/dss_home/code-envs/python/plugin_sentence-embedding_managed/lib/python2.7/site-packages/gensim/parsing/init.py", line 4, in
[2020/01/07-11:43:28.772] [null-err-72] [INFO] [dku.utils] - from .preprocessing import (remove_stopwords, strip_punctuation, strip_punctuation2, # noqa:F401
[2020/01/07-11:43:28.773] [null-err-72] [INFO] [dku.utils] - File "/Users/bob/Library/DataScienceStudio/dss_home/code-envs/python/plugin_sentence-embedding_managed/lib/python2.7/site-packages/gensim/parsing/preprocessing.py", line 42, in
[2020/01/07-11:43:28.773] [null-err-72] [INFO] [dku.utils] - from gensim import utils
[2020/01/07-11:43:28.774] [null-err-72] [INFO] [dku.utils] - File "/Users/bob/Library/DataScienceStudio/dss_home/code-envs/python/plugin_sentence-embedding_managed/lib/python2.7/site-packages/gensim/utils.py", line 45, in
[2020/01/07-11:43:28.774] [null-err-72] [INFO] [dku.utils] - from smart_open import open
[2020/01/07-11:43:28.775] [null-err-72] [INFO] [dku.utils] - ImportError: cannot import name open
[2020/01/

Thanks

du-phan commented 4 years ago

Hi there, it seems that you have a problem with the version of the smart_open package (which is installed by default along with the gensim package). Can you go to Administration -> Code envs -> plugin_sentence-embedding_managed -> Installed packages and give me the list of all the packages and their versions? Thank you!
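For anyone hitting the same ImportError, a quick check can also be run from a Python session that uses the plugin's code environment. The sketch below is only an illustration: the helper names `installed_version` and `has_attribute` are made up for this example, and it assumes a package exposes a `__version__` attribute, which not every package does.

```python
import importlib


def installed_version(module_name):
    """Return the module's __version__, "unknown" if it has none, or None if the import fails."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers both a missing package and a package that breaks while importing.
        return None
    return getattr(module, "__version__", "unknown")


def has_attribute(module_name, attr_name):
    """Return True if the module imports cleanly and exposes attr_name."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attr_name)
```

Calling `installed_version("smart_open")` and `has_attribute("smart_open", "open")` in that environment shows whether the installed smart_open exposes the `open` function that gensim tries to import in the traceback above.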

ghost commented 4 years ago

Hi there,

Do you have any idea about the below error? I have downloaded the embeddings using the macro.

Job failed: Error in Python process: At line 57: <type 'exceptions.ValueError'>: Something is wrong with the pre-trained embeddings. Please make sure to either use the plugin macro to download the embeddings, or tick the custom embedding box if you are using custom vectors.

du-phan commented 4 years ago

Do you have the folder containing the pretrained model as an input to the recipe? Can you send me the whole log, please?

ghost commented 4 years ago

Please see below.

Many Thanks!

16:41:15] [INFO] [dku] running compute_hrgwehur_NP - ---------------------------------------- [16:41:15] [INFO] [dku] running compute_hrgwehur_NP - DSS startup: jek version:6.0.1 [16:41:15] [INFO] [dku] running compute_hrgwehur_NP - DSS home: /Users/bob/Library/DataScienceStudio/dss_home [16:41:15] [INFO] [dku] running compute_hrgwehur_NP - OS: Mac OS X 10.15.2 x86_64 - Java: Oracle Corporation 1.8.0_221 [16:41:15] [INFO] [dku.flow.jobrunner] running compute_hrgwehur_NP - Allocated a slot for this activity! [16:41:15] [INFO] [dku.flow.jobrunner] running compute_hrgwehur_NP - Run activity [16:41:15] [INFO] [dku.flow.activity] running compute_hrgwehur_NP - Executing default pre-activity lifecycle hook [16:41:15] [INFO] [dku.managedfolders.handler] running compute_hrgwehur_NP - Create provider for TWITTER.ZQ1worch with path /TWITTER [16:41:15] [INFO] [dku.flow.activity] running compute_hrgwehur_NP - Checking if sources are ready [16:41:15] [DEBUG] [dku.db.internal] running compute_hrgwehur_NP - Borrowing a connection. 
Read-only: false [16:41:15] [DEBUG] [dku.db.internal] running compute_hrgwehur_NP - Created DSSDBConnection dssdb-h2-flow_state-YI1hTFz [16:41:15] [DEBUG] [dku.dataset.hash] running compute_hrgwehur_NP - Readiness cache miss for datasetadminTWITTER.train_preparedNP [16:41:15] [INFO] [dku.datasets.file] running compute_hrgwehur_NP - Building Filesystem handler config: {"connection":"filesystem_managed","path":"TWITTER/train_prepared","notReadyIfEmpty":false,"filesSelectionRules":{"mode":"ALL","excludeRules":[],"includeRules":[],"explicitFiles":[]}} [16:41:15] [INFO] [dku.datasets.ftplike] running compute_hrgwehur_NP - Enumerating Filesystem dataset prefix= [16:41:15] [DEBUG] [dku.fs.local] running compute_hrgwehur_NP - Enumerating local filesystem prefix=/ [16:41:15] [DEBUG] [dku.fs.local] running compute_hrgwehur_NP - Enumeration done nb_paths=1 size=1038227 [16:41:15] [INFO] [dku.dataset.hash] running compute_hrgwehur_NP - Caching readiness for datasetadminTWITTER.train_preparedNP s=READY h=yPH5l+XYPN7R/aYM5J+EJg [16:41:15] [INFO] [dku.flow.activity] running compute_hrgwehur_NP - Checked source readiness TWITTER.train_prepared -> true [16:41:15] [INFO] [dku.managedfolders.handler] running compute_hrgwehur_NP - Enumerating managed folder prefix= [16:41:15] [INFO] [dku.managedfolders.handler] running compute_hrgwehur_NP - Create provider for TWITTER.ZQ1worch with path /TWITTER [16:41:15] [DEBUG] [dku.fs.local] running compute_hrgwehur_NP - Enumerating local filesystem prefix=/ [16:41:15] [DEBUG] [dku.fs.local] running compute_hrgwehur_NP - Enumeration done nb_paths=1 size=5025028820 [16:41:15] [INFO] [dku.flow.activity] running compute_hrgwehur_NP - Checked source readiness TWITTER.ZQ1worch -> true [16:41:15] [DEBUG] [dku.flow.activity] running compute_hrgwehur_NP - Computing hashes to propagate BEFORE activity [16:41:15] [DEBUG] [dku.db.internal] running compute_hrgwehur_NP - Borrowing a connection. 
Read-only: false [16:41:15] [DEBUG] [dku.dataset.hash] running compute_hrgwehur_NP - Readiness cache miss for datasetadminTWITTER.train_preparedNP [16:41:15] [INFO] [dku.datasets.file] running compute_hrgwehur_NP - Building Filesystem handler config: {"connection":"filesystem_managed","path":"TWITTER/train_prepared","notReadyIfEmpty":false,"filesSelectionRules":{"mode":"ALL","excludeRules":[],"includeRules":[],"explicitFiles":[]}} [16:41:15] [INFO] [dku.datasets.ftplike] running compute_hrgwehur_NP - Enumerating Filesystem dataset prefix= [16:41:15] [DEBUG] [dku.fs.local] running compute_hrgwehur_NP - Enumerating local filesystem prefix=/ [16:41:15] [DEBUG] [dku.fs.local] running compute_hrgwehur_NP - Enumeration done nb_paths=1 size=1038227 [16:41:15] [INFO] [dku.dataset.hash] running compute_hrgwehur_NP - Caching readiness for datasetadminTWITTER.train_preparedNP s=READY h=yPH5l+XYPN7R/aYM5J+EJg [16:41:15] [INFO] [dku.managedfolders.handler] running compute_hrgwehur_NP - Enumerating managed folder prefix= [16:41:15] [INFO] [dku.managedfolders.handler] running compute_hrgwehur_NP - Create provider for TWITTER.ZQ1worch with path /TWITTER [16:41:15] [DEBUG] [dku.fs.local] running compute_hrgwehur_NP - Enumerating local filesystem prefix=/ [16:41:15] [DEBUG] [dku.fs.local] running compute_hrgwehur_NP - Enumeration done nb_paths=1 size=5025028820 [16:41:15] [DEBUG] [dku.flow.activity] running compute_hrgwehur_NP - Recorded 2 hashes before activity run [16:41:15] [DEBUG] [dku.flow.activity] running compute_hrgwehur_NP - Building recipe runner of type [16:41:15] [DEBUG] [dku.job.activity] running compute_hrgwehur_NP - Filling source sizes [16:41:15] [INFO] [dku.datasets.file] running compute_hrgwehur_NP - Building Filesystem handler config: {"connection":"filesystem_managed","path":"TWITTER/train_prepared","notReadyIfEmpty":false,"filesSelectionRules":{"mode":"ALL","excludeRules":[],"includeRules":[],"explicitFiles":[]}} [16:41:15] [INFO] [dku.datasets.ftplike] running 
compute_hrgwehur_NP - Enumerating Filesystem dataset prefix= [16:41:15] [DEBUG] [dku.fs.local] running compute_hrgwehur_NP - Enumerating local filesystem prefix=/ [16:41:15] [DEBUG] [dku.fs.local] running compute_hrgwehur_NP - Enumeration done nb_paths=1 size=1038227 [16:41:15] [DEBUG] [dku.job.activity] running compute_hrgwehur_NP - Done filling source sizes [16:41:15] [DEBUG] [dku.flow.activity] running compute_hrgwehur_NP - Recipe runner built, will use 1 thread(s) [16:41:15] [DEBUG] [dku.flow.activity] running compute_hrgwehur_NP - Starting execution thread: com.dataiku.dip.recipes.customcode.CustomPythonRecipeRunner@7fa21020 [16:41:15] [DEBUG] [dku.flow.activity] running compute_hrgwehur_NP - Execution threads started, waiting for activity end [16:41:15] [INFO] [dku.flow.activity] - Run thread for activity compute_hrgwehur_NP starting [16:41:15] [INFO] [dku.flow.custompython] - Dumping Python script to /Users/bob/Library/DataScienceStudio/dss_home/jobs/TWITTER/Build_hrgwehur_2020-01-07T14-41-14.950/compute_hrgwehur_NP/custom-python-recipe/pyoutVply6rdrrEVA/script.py [16:41:15] [INFO] [dip.venv.selector] - Select in plugin with {"defaultPermission":{"admin":false},"permissions":[],"parameterSets":[],"config":{},"codeEnvName":"plugin_sentence-embedding_managed","presets":[],"gitConfig":{}} [16:41:15] [INFO] [dku.flow.abstract.python] - Dumping Python script to /Users/bob/Library/DataScienceStudio/dss_home/jobs/TWITTER/Build_hrgwehur_2020-01-07T14-41-14.950/compute_hrgwehur_NP/custom-python-recipe/pyoutVply6rdrrEVA/script.py [16:41:15] [INFO] [dku.datasets.file] - Building Filesystem handler config: {"connection":"filesystem_managed","path":"TWITTER/hrgwehur","notReadyIfEmpty":false,"filesSelectionRules":{"mode":"ALL","excludeRules":[],"includeRules":[],"explicitFiles":[]}} [16:41:15] [WARN] [dku.fs.local] - File does not exist: /Users/bob/Library/DataScienceStudio/dss_home/managed_datasets/TWITTER/hrgwehur [16:41:15] [INFO] [dku.datasets.file] - Building 
Filesystem handler config: {"connection":"filesystem_managed","path":"TWITTER/train_prepared","notReadyIfEmpty":false,"filesSelectionRules":{"mode":"ALL","excludeRules":[],"includeRules":[],"explicitFiles":[]}} [16:41:15] [WARN] [dku.code.projectLibs] - External libraries file not found: /Users/bob/Library/DataScienceStudio/dss_home/jobs/TWITTER/Build_hrgwehur_2020-01-07T14-41-14.950/localconfig/projects/TWITTER/lib/external-libraries.json [16:41:15] [INFO] [dku.code.projectLibs] - EXTERNAL LIBS FROM TWITTER is {"gitReferences":{},"pythonPath":["python"],"rsrcPath":["R"],"importLibrariesFromProjects":[]} [16:41:15] [INFO] [dku.code.projectLibs] - chunkFolder is /Users/bob/Library/DataScienceStudio/dss_home/jobs/TWITTER/Build_hrgwehur_2020-01-07T14-41-14.950/localconfig/projects/TWITTER/lib/R [16:41:15] [INFO] [dip.plugin.presets] - Checking project-level settings for overriden presets and additional presets [16:41:15] [INFO] [dip.plugin.presets] - Resolve for {"aggregation_method":"simple_average","embedding_is_custom":false,"advanced_settings":false,"smoothing_parameter":0.001,"n_principal_components":1,"text_column_names":["text"]} [16:41:15] [INFO] [xxx] - RSRC PATH: ["/Users/bob/Library/DataScienceStudio/dss_home/jobs/TWITTER/Build_hrgwehur_2020-01-07T14-41-14.950/localconfig/projects/TWITTER/lib/R"] [16:41:15] [INFO] [dku.recipes.code.base] - Writing dku-exec-env for local execution in /Users/bob/Library/DataScienceStudio/dss_home/jobs/TWITTER/Build_hrgwehur_2020-01-07T14-41-14.950/compute_hrgwehur_NP/custom-python-recipe/pyoutVply6rdrrEVA/remote-run-env-def.json [16:41:15] [INFO] [dku.code.envs.resolution] - Executing Python activity in env: plugin_sentence-embedding_managed [16:41:15] [INFO] [dku.flow.abstract.python] - Execute activity command: 
["/Users/bob/Library/DataScienceStudio/dss_home/code-envs/python/plugin_sentence-embedding_managed/bin/python","-u","/Users/bob/Library/DataScienceStudio/dss_home/jobs/TWITTER/Build_hrgwehur_2020-01-07T14-41-14.950/compute_hrgwehur_NP/custom-python-recipe/pyoutVply6rdrrEVA/python-exec-wrapper.py","/Users/bob/Library/DataScienceStudio/dss_home/jobs/TWITTER/Build_hrgwehur_2020-01-07T14-41-14.950/compute_hrgwehur_NP/custom-python-recipe/pyoutVply6rdrrEVA/script.py"] [16:41:15] [INFO] [dku.recipes.code.base] - Run command insecurely, from user bob [16:41:15] [INFO] [dku.security.process] - Starting process (regular) [16:41:15] [INFO] [dku.security.process] - Process started with pid=4791 [16:41:15] [INFO] [dku.processes.cgroups] - Will use cgroups [] [16:41:15] [INFO] [dku.processes.cgroups] - Applying rules to used cgroups: [] [16:41:15] [INFO] [dku.recipes.code.base] - Process reads from nothing [16:41:15] [INFO] [dku.utils] - 2020-01-07 16:41:15,622 INFO -------------------- [16:41:15] [INFO] [dku.utils] - 2020-01-07 16:41:15,622 INFO Dataiku Python entrypoint starting up [16:41:15] [INFO] [dku.utils] - 2020-01-07 16:41:15,622 INFO executable = /Users/bob/Library/DataScienceStudio/dss_home/code-envs/python/plugin_sentence-embedding_managed/bin/python [16:41:15] [INFO] [dku.utils] - 2020-01-07 16:41:15,622 INFO argv = ['/Users/bob/Library/DataScienceStudio/dss_home/jobs/TWITTER/Build_hrgwehur_2020-01-07T14-41-14.950/compute_hrgwehur_NP/custom-python-recipe/pyoutVply6rdrrEVA/python-exec-wrapper.py', '/Users/bob/Library/DataScienceStudio/dss_home/jobs/TWITTER/Build_hrgwehur_2020-01-07T14-41-14.950/compute_hrgwehur_NP/custom-python-recipe/pyoutVply6rdrrEVA/script.py'] [16:41:15] [INFO] [dku.utils] - 2020-01-07 16:41:15,622 INFO -------------------- [16:41:15] [INFO] [dku.utils] - 2020-01-07 16:41:15,622 INFO Looking for RemoteRunEnvDef in ./remote-run-env-def.json [16:41:15] [INFO] [dku.utils] - 2020-01-07 16:41:15,622 INFO Found RemoteRunEnvDef environment: 
./remote-run-env-def.json [16:41:15] [INFO] [dku.utils] - 2020-01-07 16:41:15,622 INFO Running a DSS Python recipe locally, uinsetting env [16:41:15] [INFO] [dku.utils] - 2020-01-07 16:41:15,622 INFO Setup complete, ready to execute Python code [16:41:15] [INFO] [dku.utils] - 2020-01-07 16:41:15,622 INFO Sys path: ['/Users/bob/Library/DataScienceStudio/dss_home/jobs/TWITTER/Build_hrgwehur_2020-01-07T14-41-14.950/compute_hrgwehur_NP/custom-python-recipe/pyoutVply6rdrrEVA', '/Users/bob/Library/DataScienceStudio/dss_home/lib/python', '/Applications/DataScienceStudio.app/Contents/Resources/kit/python', '/Users/bob/Library/DataScienceStudio/dss_home/code-envs/python/plugin_sentence-embedding_managed/lib/python27.zip', '/Users/bob/Library/DataScienceStudio/dss_home/code-envs/python/plugin_sentence-embedding_managed/lib/python2.7', '/Users/bob/Library/DataScienceStudio/dss_home/code-envs/python/plugin_sentence-embedding_managed/lib/python2.7/plat-darwin', '/Users/bob/Library/DataScienceStudio/dss_home/code-envs/python/plugin_sentence-embedding_managed/lib/python2.7/plat-mac', '/Users/bob/Library/DataScienceStudio/dss_home/code-envs/python/plugin_sentence-embedding_managed/lib/python2.7/plat-mac/lib-scriptpackages', '/Users/bob/Library/DataScienceStudio/dss_home/code-envs/python/plugin_sentence-embedding_managed/lib/python2.7/lib-tk', '/Users/bob/Library/DataScienceStudio/dss_home/code-envs/python/plugin_sentence-embedding_managed/lib/python2.7/lib-old', '/Users/bob/Library/DataScienceStudio/dss_home/code-envs/python/plugin_sentence-embedding_managed/lib/python2.7/lib-dynload', '/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7', '/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/plat-darwin', '/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-tk', '/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/plat-mac', 
'/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/plat-mac/lib-scriptpackages', '/Users/bob/Library/DataScienceStudio/dss_home/code-envs/python/plugin_sentence-embedding_managed/lib/python2.7/site-packages', u'/Users/bob/Library/DataScienceStudio/dss_home/jobs/TWITTER/Build_hrgwehur_2020-01-07T14-41-14.950/localconfig/projects/TWITTER/lib/python', u'/Users/bob/Library/DataScienceStudio/dss_home/plugins/installed/sentence-embedding/python-lib'] [16:41:15] [INFO] [dku.utils] - 2020-01-07 16:41:15,622 INFO Script file: /Users/bob/Library/DataScienceStudio/dss_home/jobs/TWITTER/Build_hrgwehur_2020-01-07T14-41-14.950/compute_hrgwehur_NP/custom-python-recipe/pyoutVply6rdrrEVA/script.py [16:41:16] [INFO] [dku.utils] - /Users/bob/Library/DataScienceStudio/dss_home/code-envs/python/plugin_sentence-embedding_managed/lib/python2.7/site-packages/scipy/sparse/sparsetools.py:21: DeprecationWarning: scipy.sparse.sparsetools is deprecated! [16:41:16] [INFO] [dku.utils] - scipy.sparse.sparsetools is a private module for scipy.sparse, and should not be used. [16:41:16] [INFO] [dku.utils] - _deprecated() [16:41:16] [INFO] [dku.utils] - 2020-01-07 16:41:16,505 INFO 'pattern' package not found; tag filters are not available for English [16:41:18] [INFO] [dku.utils] - 2020-01-07 16:41:18,533 INFO Loading word embeddings from the input folder... 
[16:41:18] [INFO] [dku.utils] - *** Recipe code failed **
[16:41:18] [INFO] [dku.utils] - Begin Python stack
[16:41:18] [INFO] [dku.utils] - Traceback (most recent call last):
[16:41:18] [INFO] [dku.utils] - File "/Users/bob/Library/DataScienceStudio/dss_home/jobs/TWITTER/Build_hrgwehur_2020-01-07T14-41-14.950/compute_hrgwehur_NP/custom-python-recipe/pyoutVply6rdrrEVA/python-exec-wrapper.py", line 194, in
[16:41:18] [INFO] [dku.utils] - exec(f.read())
[16:41:18] [INFO] [dku.utils] - File "", line 57, in
[16:41:18] [INFO] [dku.utils] - File "/Users/bob/Library/DataScienceStudio/dss_home/plugins/installed/sentence-embedding/python-lib/commons.py", line 49, in load_pretrained_model
[16:41:18] [INFO] [dku.utils] - "or tick the custom embedding box if you are using custom vectors.")
[16:41:18] [INFO] [dku.utils] - ValueError: Something is wrong with the pre-trained embeddings. Please make sure to either use the plugin macro to download the embeddings, or tick the custom embedding box if you are using custom vectors.
[16:41:18] [INFO] [dku.utils] - End Python stack
[16:41:18] [INFO] [dku.utils] - 2020-01-07 16:41:18,534 INFO Check if spark is available
[16:41:18] [INFO] [dku.utils] - 2020-01-07 16:41:18,535 INFO Not stopping a spark context: No module named pyspark.context
[16:41:18] [INFO] [dku.recipes.code.base] - Error file found, trying to throw it: /Users/bob/Library/DataScienceStudio/dss_home/jobs/TWITTER/Build_hrgwehur_2020-01-07T14-41-14.950/compute_hrgwehur_NP/custom-python-recipe/pyoutVply6rdrrEVA/error.json
[16:41:18] [INFO] [dku.recipes.code.base] - Raw error is{"errorType":"\u003ctype \u0027exceptions.ValueError\u0027\u003e","message":"Something is wrong with the pre-trained embeddings. Please make sure to either use the plugin macro to download the embeddings, or tick the custom embedding box if you are using custom vectors.","detailedMessage":"At line 57: \u003ctype \u0027exceptions.ValueError\u0027\u003e: Something is wrong with the pre-trained embeddings. Please make sure to either use the plugin macro to download the embeddings, or tick the custom embedding box if you are using custom vectors.","stackTrace":[]}
[16:41:18] [INFO] [dku.recipes.code.base] - Now err: {"errorType":"\u003ctype \u0027exceptions.ValueError\u0027\u003e","message":"Error in Python process: Something is wrong with the pre-trained embeddings. Please make sure to either use the plugin macro to download the embeddings, or tick the custom embedding box if you are using custom vectors.","detailedMessage":"Error in Python process: At line 57: \u003ctype \u0027exceptions.ValueError\u0027\u003e: Something is wrong with the pre-trained embeddings. Please make sure to either use the plugin macro to download the embeddings, or tick the custom embedding box if you are using custom vectors.","stackTrace":[]}
[16:41:18] [INFO] [dku.flow.activity] - Run thread failed for activity compute_hrgwehur_NP
com.dataiku.common.server.APIError$SerializedErrorException: Error in Python process: At line 57: <type 'exceptions.ValueError'>: Something is wrong with the pre-trained embeddings. Please make sure to either use the plugin macro to download the embeddings, or tick the custom embedding box if you are using custom vectors.
    at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.handleErrorFile(AbstractCodeBasedActivityRunner.java:186)
    at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.handleExecutionResult(AbstractCodeBasedActivityRunner.java:166)
    at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.execute(AbstractCodeBasedActivityRunner.java:102)
    at com.dataiku.dip.dataflow.exec.AbstractPythonRecipeRunner.executeScript(AbstractPythonRecipeRunner.java:48)
    at com.dataiku.dip.recipes.customcode.CustomPythonRecipeRunner.run(CustomPythonRecipeRunner.java:71)
    at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:380)
[16:41:18] [INFO] [dku.flow.activity] running compute_hrgwehur_NP - activity is finished
[16:41:18] [ERROR] [dku.flow.activity] running compute_hrgwehur_NP - Activity failed
com.dataiku.common.server.APIError$SerializedErrorException: Error in Python process: At line 57: <type 'exceptions.ValueError'>: Something is wrong with the pre-trained embeddings. Please make sure to either use the plugin macro to download the embeddings, or tick the custom embedding box if you are using custom vectors.
    at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.handleErrorFile(AbstractCodeBasedActivityRunner.java:186)
    at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.handleExecutionResult(AbstractCodeBasedActivityRunner.java:166)
    at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.execute(AbstractCodeBasedActivityRunner.java:102)
    at com.dataiku.dip.dataflow.exec.AbstractPythonRecipeRunner.executeScript(AbstractPythonRecipeRunner.java:48)
    at com.dataiku.dip.recipes.customcode.CustomPythonRecipeRunner.run(CustomPythonRecipeRunner.java:71)
    at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:380)
[16:41:18] [INFO] [dku.flow.activity] running compute_hrgwehur_NP - Executing default post-activity lifecycle hook
[16:41:18] [INFO] [dku.flow.activity] running compute_hrgwehur_NP - Removing samples for TWITTER.hrgwehur
[16:41:18] [INFO] [dku.flow.activity] running compute_hrgwehur_NP - Done post-activity tasks

du-phan commented 4 years ago

Which model did you download? Can you check that the model was actually downloaded into the folder?

ghost commented 4 years ago

I downloaded both GloVe and Word2vec and get the same issue with each. Please check the screenshots.

Screenshot 2020-01-07 at 16 49 51 Screenshot 2020-01-07 at 16 50 01

du-phan commented 4 years ago

Hm, it's weird that your model is in a sub-folder. We check the file name to determine the pretrained model's type, so when the plugin sees ZQ1worch, it throws an error because that is not a legitimate model folder.

By default, the macro downloads the model to the right place, so it is odd that this did not happen in your case. You can try rerunning the macro with a new folder.
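To illustrate why a model sitting in a sub-folder is not recognised, here is a minimal sketch of a file-name-based lookup. It is not the plugin's actual code: the file names in `KNOWN_MODEL_FILES` and both function names are hypothetical, chosen only to show the idea of inspecting the top level of the input folder.

```python
import os

# Hypothetical mapping from file name to model type; the plugin's real
# file names are not shown in this thread.
KNOWN_MODEL_FILES = {
    "word2vec_model.bin": "word2vec",
    "glove_model.txt": "glove",
    "fasttext_model.bin": "fasttext",
}


def detect_model_type(folder_path, known_files=KNOWN_MODEL_FILES):
    """Return the model type of the first known file at the top level of the folder."""
    for name in sorted(os.listdir(folder_path)):
        full_path = os.path.join(folder_path, name)
        if os.path.isfile(full_path) and name in known_files:
            return known_files[name]
    raise ValueError("No recognised model file at the top level of %s" % folder_path)


def find_nested_model_files(folder_path, known_files=KNOWN_MODEL_FILES):
    """List known model files located in sub-folders, relative to folder_path."""
    nested = []
    for root, _dirs, files in os.walk(folder_path):
        if root == folder_path:
            continue  # top-level files are handled by detect_model_type
        for name in files:
            if name in known_files:
                nested.append(os.path.relpath(os.path.join(root, name), folder_path))
    return sorted(nested)
```

With a layout like the one in the screenshots, `detect_model_type` would raise a `ValueError` because only a sub-folder is visible at the top level, while `find_nested_model_files` would report where the model file actually ended up.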

ghost commented 4 years ago

Thanks, I removed it from the folder and it worked!

du-phan commented 4 years ago

OK, great! Do you still have the problem with the smart_open package?

ghost commented 4 years ago

No, Word2vec and GloVe work fine. I have a problem with fastText!

du-phan commented 4 years ago

Is it the error in your first message?

ghost commented 4 years ago

My error, sorry. It worked fine!

du-phan commented 4 years ago

Good to hear. Should I close this ticket?

ghost commented 4 years ago

Yes!

Thanks!

du-phan commented 4 years ago

great!