Closed 2017wxyzwxyz closed 4 years ago
You should run the notebook version that matches your installed analytics-zoo. If you installed analytics-zoo with pip install, the latest released version is currently 0.8.1, so you should run the notebook on branch-0.8. If you want to try the latest version on master, which hasn't been released yet, you need to build a whl from the master code and install it manually (you can follow the steps here).
But when I run the old version of 'NYC taxi dataset.ipynb' on master, the following error also occurs:
from zoo.automl.common.util import split_input_df
train_df, val_df, test_df = split_input_df(df, val_split_ratio=0.1, test_split_ratio=0.1)
Prepending /home/wxy/anaconda3/envs/ZooAutoml/lib/python3.6/site-packages/bigdl/share/conf/spark-bigdl.conf to sys.path
Adding /home/wxy/anaconda3/envs/ZooAutoml/lib/python3.6/site-packages/zoo/share/lib/analytics-zoo-bigdl_0.10.0-spark_2.4.3-0.8.1-jar-with-dependencies.jar to BIGDL_JARS
Prepending /home/wxy/anaconda3/envs/ZooAutoml/lib/python3.6/site-packages/zoo/share/conf/spark-analytics-zoo.conf to sys.path
NameError Traceback (most recent call last)
1 from zoo.automl.common.util import split_input_df
----> 2 train_df, val_df, test_df = split_input_df(df, val_split_ratio=0.1, test_split_ratio=0.1)
NameError: name 'df' is not defined
Let me ask you two questions:
1. I did install analytics-zoo with pip install (it seems to have been downloaded on March 6th), so I should run the notebook on branch-0.8. But where is 'NYC taxi dataset.ipynb' on branch-0.8?
2. In the future I would like to write code on your platform to do my work, but I am not familiar with data research or the Python language. Can development on this unified platform be done in Python alone, without learning other languages such as Scala? I usually work with C/C++ and MATLAB.
As for the error: again, the notebook and the installed library must be consistent. If you installed from master, use the notebook on master. If you installed with "pip install", find out which version of zoo you installed and download the notebook from the matching branch. For example, if you installed 0.8.1, you can download the notebook from https://github.com/intel-analytics/analytics-zoo/tree/branch-0.8.
The notebook is still in the same place. Since you didn't say which one you're using, I assume it is the notebook in the automl folder. You can find it on the 0.8 branch at https://github.com/intel-analytics/analytics-zoo/blob/branch-0.8/apps/automl/nyc_taxi_dataset.ipynb
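One way to check which notebook branch matches the installed package is sketched below. It assumes Python 3.8 or later for importlib.metadata (the environment in this thread is Python 3.6, where "pip show analytics-zoo" reports the same version). The helper names installed_version and branch_for are hypothetical, and the branch-<major>.<minor> naming is inferred only from the 0.8.1 and branch-0.8 pairing mentioned above, so it may not hold for every release.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def branch_for(version):
    """Map a release such as '0.8.1' to a branch name such as 'branch-0.8'.

    Assumption: release branches are named branch-<major>.<minor>.
    """
    major, minor = version.split(".")[:2]
    return f"branch-{major}.{minor}"


version = installed_version("analytics-zoo")
if version is None:
    print("analytics-zoo is not installed in this environment")
else:
    print(f"installed {version}; use the notebook from {branch_for(version)}")
```

A nightly build would likely report a pre-release version string, in which case the master notebook is the one to use rather than a release branch.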
C/C++ and MATLAB are not supported. Most deep learning frameworks use Python interfaces, so we support Python. Some of the models have both Scala and Python interfaces.
To use master, you can also pip install our nightly build packages. Those packages contain the latest code updates and can be used together with the master version of the notebook. Refer to https://analytics-zoo.github.io/master/#PythonUserGuide/install/#install-the-latest-nightly-build-wheels-for-pip for details on how to install the latest nightly build packages.
Have you downloaded the nyc_taxi dataset as described in the "Run Jupyter" part of the nyc taxi readme? Were there any other error messages before this one?
Yes, but I downloaded the nyc_taxi dataset directly from its web address rather than by running the script, and I put the 'nyc_taxi.csv' file in the current directory. The code is changed as follows:
try:
    # raw_df = pd.read_csv(dataset_path)
    raw_df = pd.read_csv("nyc_taxi.csv")
except Exception as e:
    print("nyc_taxi.csv doesn't exist")
    print("you can run $ANALYTICS_ZOO_HOME/bin/data/NAB/nyc_taxi/get_nyc_taxi.sh to download nyc_taxi.csv")
There were no other error messages.
This seems to be our master version. The error message says that "df" is not defined. Could you please check that you executed the cell that assigns a value to "df" successfully before using "df"?
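A small guard at the top of the failing cell can make this kind of problem easier to spot. The sketch below is only an illustration: require_defined is a hypothetical helper, not part of analytics-zoo. It reports which expected variables are missing from a namespace instead of letting a later line fail with a bare NameError.

```python
def require_defined(namespace, *names):
    """Raise a descriptive error if any expected variable is missing."""
    missing = [name for name in names if name not in namespace]
    if missing:
        raise RuntimeError(
            "run the earlier cells first; undefined: " + ", ".join(missing)
        )


# Demonstration with a plain dict standing in for the notebook's globals().
demo_namespace = {"raw_df": object()}
try:
    require_defined(demo_namespace, "raw_df", "df")
except RuntimeError as err:
    print(err)
```

In a notebook, the call would be require_defined(globals(), "df") placed just before the split function is used.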
I did install analytics-zoo with pip install, so I should run the notebook on branch-0.8. Following your instructions, I downloaded from two addresses, 'https://github.com/intel-analytics/analytics-zoo/tree/branch-0.8' and 'https://github.com/intel-analytics/analytics-zoo/blob/branch-0.8/apps/automl/nyc_taxi_dataset.ipynb', and opened nyc_taxi_dataset.ipynb, but both fail to open with the same error:
Error loading notebook
Unreadable Notebook: /mnt/f/zooAutoml/jupyterCode/22/nyc_taxi_dataset-7-21-2.ipynb NotJSONError('Notebook does not appear to be JSON: \'\n\n\n\n\n<!DOCTYPE html>\n<html lang="...',)
You said "This seems to be our master version", but this is the old version's run result.
Today I opened the new nyc_taxi_dataset.ipynb. Its code contains the statement 'df = pd.DataFrame(pd.to_datetime(raw_df.timestamp))', and df.head() prints:
0 | 2014-07-01 00:00:00 | 10844
1 | 2014-07-01 00:30:00 | 8127
2 | 2014-07-01 01:00:00 | 6210
3 | 2014-07-01 01:30:00 | 4656
4 | 2014-07-01 02:00:00 | 3820
The next statement,
from zoo.automl.common.util import train_val_test_split
train_df, val_df, test_df = train_val_test_split(df, val_ratio=0.1, test_ratio=0.1)
fails with the following error:
Prepending /home/wxy/anaconda3/envs/ZooAutoml/lib/python3.6/site-packages/bigdl/share/conf/spark-bigdl.conf to sys.path
Adding /home/wxy/anaconda3/envs/ZooAutoml/lib/python3.6/site-packages/zoo/share/lib/analytics-zoo-bigdl_0.10.0-spark_2.4.3-0.8.1-jar-with-dependencies.jar to BIGDL_JARS
Prepending /home/wxy/anaconda3/envs/ZooAutoml/lib/python3.6/site-packages/zoo/share/conf/spark-analytics-zoo.conf to sys.path
ImportError Traceback (most recent call last)
Again, you should not run the notebook from analytics-zoo master with another version of analytics-zoo installed (e.g. 0.8.1). If you do want to run the new notebook on master, you could try installing our nightly build packages as suggested before.
As you can see, in version 0.8.1 the util function is named split_input_df, and on master it has been renamed to train_val_test_split. Therefore you need to use the same version of zoo to run the corresponding notebook.
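To see which of the two names an installed package actually provides, a quick diagnostic along the following lines may help. It is a sketch only: resolve_first is a hypothetical helper, not part of analytics-zoo, and it returns (None, None) when the module cannot be imported at all. The recommended fix is still to match the notebook to the installed version.

```python
import importlib


def resolve_first(module_name, candidates):
    """Return (name, object) for the first candidate the module provides.

    Returns (None, None) if the module cannot be imported or has none of them.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None, None
    for name in candidates:
        if hasattr(module, name):
            return name, getattr(module, name)
    return None, None


# Newer name first, then the 0.8.1 name, as described in this thread.
name, split_fn = resolve_first(
    "zoo.automl.common.util", ["train_val_test_split", "split_input_df"]
)
print("available split function:", name)
```

Note that the keyword arguments shown in this thread also differ between the two functions (val_ratio and test_ratio versus val_split_ratio and test_split_ratio), so the call itself has to change along with the name.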
NameError Traceback (most recent call last)
1 from zoo.automl.common.util import split_input_df
----> 2 train_df, val_df, test_df = split_input_df(df, val_split_ratio=0.1, test_split_ratio=0.1)
NameError: name 'df' is not defined
The error message above is from version 0.8.1. But when I asked whether you have downloaded the dataset correctly, you gave me the code from the new notebook as below.
Yes, but I downloaded the nyc_taxi dataset directly from its web address rather than by running the script, and I put the 'nyc_taxi.csv' file in the current directory. The code is changed as follows:
try:
    dataset_path = os.getenv("ANALYTICS_ZOO_HOME")+"/bin/data/NAB/nyc_taxi/nyc_taxi.csv"
    raw_df = pd.read_csv(dataset_path)
    raw_df = pd.read_csv("nyc_taxi.csv")
except Exception as e:
    print("nyc_taxi.csv doesn't exist")
    print("you can run $ANALYTICS_ZOO_HOME/bin/data/NAB/nyc_taxi/get_nyc_taxi.sh to download nyc_taxi.csv")
There were no other error messages.
So I am a little confused, because you told me that "This is the old version's run result."
So have you downloaded the dataset correctly for the version 0.8.1 notebook? Is there still a NameError? Please make sure the notebook version matches your analytics-zoo version.
How did you download the notebook?
You could download the notebook via
wget https://raw.githubusercontent.com/intel-analytics/analytics-zoo/branch-0.8/apps/automl/nyc_taxi_dataset.ipynb
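The raw address differs from the browser ("blob") address only in its host and in the missing "blob" path segment. The conversion can be sketched as below. The function name blob_to_raw is hypothetical, and the mapping is inferred from the two URLs that appear in this thread rather than from any documented guarantee.

```python
from urllib.parse import urlsplit


def blob_to_raw(url):
    """Convert a github.com .../blob/<branch>/<path> URL to its raw form."""
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    # Expected layout: owner / repo / "blob" / branch / path...
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        raise ValueError("not a github.com blob URL: " + url)
    owner, repo, _blob, *rest = segments
    return "https://raw.githubusercontent.com/" + "/".join([owner, repo, *rest])


print(blob_to_raw(
    "https://github.com/intel-analytics/analytics-zoo/blob/branch-0.8/"
    "apps/automl/nyc_taxi_dataset.ipynb"
))
```

Saving the blob page itself yields GitHub's HTML viewer page, not the notebook JSON, which is consistent with the NotJSONError reported earlier.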
Oh, I downloaded it with a browser, not with the wget command.
But when I download 'nyc_taxi_dataset.ipynb' according to the instructions, there is an error:
(base) wxy@SC-202007131929:~$ wget https://raw.githubusercontent.com/intel-analytics/analytics-zoo/branch-0.8/apps/automl/nyc_taxi_dataset.ipynb
--2020-07-21 13:50:49-- https://raw.githubusercontent.com/intel-analytics/analytics-zoo/branch-0.8/apps/automl/nyc_taxi_dataset.ipynb
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 0.0.0.0, ::
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|0.0.0.0|:443... failed: Connection refused.
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|::|:443... failed: Connection refused.
(base) wxy@SC-202007131929:~$
You can download it from a browser, but you need to download the raw file. First go to our notebook on branch-0.8, then click "Raw", then save the page as a file.
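Before opening a downloaded file in Jupyter, it can be checked for the problem behind the NotJSONError above, namely that an HTML page was saved instead of the notebook. The following is a minimal sketch: looks_like_notebook is a hypothetical helper, and it assumes a version 4 notebook, which is a JSON object with a top-level "cells" key.

```python
import json


def looks_like_notebook(path):
    """Return True if the file parses as JSON and has a top-level 'cells' key."""
    try:
        with open(path, encoding="utf-8") as handle:
            data = json.load(handle)
    except (OSError, ValueError):
        # Missing file, unreadable bytes, or non-JSON content such as HTML.
        return False
    return isinstance(data, dict) and "cells" in data
```

For example, looks_like_notebook("nyc_taxi_dataset.ipynb") returning False suggests the file should be downloaded again from the raw address.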
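The page titled analytics-zoo/apps/automl/nyc_taxi_dataset.ipynb shows: "Sorry, something went wrong. Reload?"
After clicking "Raw":
The page cannot be displayed (ERR_NAME_NOT_RESOLVED). Press F5 to refresh the page, or try: open Browser Doctor, check the proxy server settings.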
Please send the branch-0.8 notebook (nyc_taxi_dataset.ipynb) to my email: testmywebsite@163.com
I tried opening the branch-0.8 nyc_taxi_dataset.ipynb at the following address: https://nbviewer.jupyter.org/github/intel-analytics/analytics-zoo/blob/branch-0.8/apps/automl/nyc_taxi_dataset.ipynb
Then I created a new *.ipynb file, copied the displayed content into it bit by bit, and ran it. The following error occurred:
........................ code .................................
from zoo import init_spark_on_local
from zoo.ray import RayContext
sc = init_spark_on_local(cores=4)
ray_ctx = RayContext(sc=sc, object_store_memory="1g")
ray_ctx.init()
........................ run error .................................
Current pyspark location is : /home/wxy/anaconda3/envs/ZooAutoml/lib/python3.6/site-packages/pyspark/__init__.py
Start to getOrCreate SparkContext
Exception Traceback (most recent call last)
Could you please open another issue, since this is a different error? Keeping one specific error per issue makes it easier for other users to refer to.
Certainly! I'm not familiar with GitHub usage.
When I run the latest version of 'NYC taxi dataset.ipynb', the following error occurs:
from zoo.automl.common.util import train_val_test_split
train_df, val_df, test_df = train_val_test_split(df, val_ratio=0.1, test_ratio=0.1)
Prepending /home/wxy/anaconda3/envs/ZooAutoml/lib/python3.6/site-packages/bigdl/share/conf/spark-bigdl.conf to sys.path
Adding /home/wxy/anaconda3/envs/ZooAutoml/lib/python3.6/site-packages/zoo/share/lib/analytics-zoo-bigdl_0.10.0-spark_2.4.3-0.8.1-jar-with-dependencies.jar to BIGDL_JARS
Prepending /home/wxy/anaconda3/envs/ZooAutoml/lib/python3.6/site-packages/zoo/share/conf/spark-analytics-zoo.conf to sys.path
ImportError Traceback (most recent call last)
----> 1 from zoo.automl.common.util import train_val_test_split
2 train_df, val_df, test_df = train_val_test_split(df, val_ratio=0.1, test_ratio=0.1)
ImportError: cannot import name 'train_val_test_split'