Closed XikunHuang closed 4 years ago
Hi Xikun,
First of all, thank you for the detailed bug report; it's very easy to follow. Regarding the errors/warnings you are encountering:
Error 1: This is indeed a bug in the example. We made some changes to the library and forgot to update this example accordingly, sorry about that. I'll push a fix for it in a few minutes. Your solution of removing that for loop is indeed correct :)
Error 2: This is also a bug, and your solution should work fine for py3. However, I'm not entirely sure it will on py2. I'll need a bit more time to look into it.
Error 3: Although it says Error, this should simply be a warning. It tells you that you have not selected a train/validation split, so the library will compute one for you. This train/validation split will have a fixed 90/10 ratio and otherwise the same parameters as your train/test split. The train/validation split is necessary in order to tune the hyperparameters of node2vec (as you can see, tune_params is set to tune p and q). In the next update, I will include an explicit train/validation edge split so the "error" disappears. I'll also make it a warning.
Warning 4: The warnings (which should be made a bit more clear) basically tell you the following:
WARNING:root:Output of method metapath2vec++ contains 2 more lines than expected. Will consider them part of the header and ignore them... Expected num_lines 703, obtained lines 705.
In this case, metapath2vec returned an embedding file with 705 lines. Of those, 703 lines were identified as embedding vectors corresponding to graph nodes (your graph, after being preprocessed, contained 703 nodes). The remaining two lines in the file were considered to be header lines and thus ignored by EvalNE. Most methods return header lines in their output files, so you will see that warning a lot. And metapath does indeed return two header lines.
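The header arithmetic described above can be sketched as follows. This is only an illustration of the idea (count the lines, subtract the number of nodes, treat the surplus as header), not EvalNE's actual implementation; the function name `infer_header_rows` and the fake file contents are assumptions made up for this example.

```python
import io


def infer_header_rows(lines, expected_nodes):
    """Return how many leading lines to skip, assuming one embedding line per node.

    Hypothetical helper for illustration; not part of EvalNE.
    """
    num_lines = sum(1 for _ in lines)
    extra = num_lines - expected_nodes
    if extra < 0:
        raise ValueError(
            "Found {} lines but expected at least {}".format(num_lines, expected_nodes)
        )
    return extra


# Fake method output: 2 header lines followed by 703 embedding lines (705 in total),
# mirroring the numbers in the warning above.
fake_output = io.StringIO(
    "header 1\nheader 2\n" + "\n".join("n 0.1" for _ in range(703)) + "\n"
)
skip = infer_header_rows(fake_output, expected_nodes=703)
print(skip)
```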
WARNING:root:Output provided by method metapath2vec++ contains 129 columns, 128 expected! Taking first column as nodeID...
This warning tells you that the output embedding file of metapath contained one more column than expected. In this case, you asked for 128-dimensional embeddings but EvalNE found 129 columns in the file. The library will automatically take the first of those columns as the nodeID and the remaining ones as the actual embeddings of nodes. The reason for this warning is that NE methods behave in one of two ways. They return the embeddings either as:
NodeID0, x00, x01, ... x0d
NodeID1, x10, x11, ... x1d
...
NodeIDn, xn0, xn1, ... xnd
or
x00, x01, ... x0d
x10, x11, ... x1d
...
xn0, xn1, ... xnd
The warning basically tells you that metapath is a method that returns the data as in the first example and not as in the second one.
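The two output layouts can be told apart by the column count alone. The sketch below shows that check under simple assumptions (rows already parsed into lists of numbers, a small `dim` for readability); `split_ids_and_vectors` is a hypothetical name and this is not how EvalNE itself is written.

```python
def split_ids_and_vectors(rows, dim):
    """Separate an optional leading nodeID column from the embedding values.

    Rows with dim + 1 columns are read as [nodeID, x0, ..., xd];
    rows with exactly dim columns are read as plain vectors (ID is None).
    """
    ids, vectors = [], []
    for row in rows:
        if len(row) == dim + 1:
            ids.append(row[0])
            vectors.append(list(row[1:]))
        elif len(row) == dim:
            ids.append(None)
            vectors.append(list(row))
        else:
            raise ValueError(
                "Row has {} columns, expected {} or {}".format(len(row), dim, dim + 1)
            )
    return ids, vectors


# First layout: nodeID followed by the embedding (dim = 3 here instead of 128).
with_ids = [[7, 0.1, 0.2, 0.3], [9, 0.4, 0.5, 0.6]]
ids, vectors = split_ids_and_vectors(with_ids, dim=3)

# Second layout: embedding values only.
no_ids, plain_vectors = split_ids_and_vectors([[0.1, 0.2, 0.3]], dim=3)
```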
Finally, the results you are getting seem correct to me. I hope this helps, and thanks again for pointing out those bugs!
Alex
Hi Alex,
Thanks for your quick and detailed reply. It really helps!
I encountered a new error when I ran examples/node2vec/conf_node2vec.ini
Describe the bug
FileNotFoundError: [Errno 2] No such file or directory: './emb.tmp'
OSError: Execution of method node2vec did not generate node embeddings file.
Possible reasons: 1) method is not correctly installed or 2) wrong method call or parameters...
To Reproduce
python3 evalne ./examples/node2vec/conf_node2vec.ini
#################### Error message in file eval.log#############################
10-12-19 09:51:20 - INFO: ------ Repetition 2 of experiment ------
10-12-19 09:53:21 - WARNING: Output of method node2vec contains 1 more lines than expected. Will consider them part of the header and ignore them... Expected num_lines 4039, obtained lines 4040.
----------------------- many similar WARNING-----------------------------
10-12-19 10:00:55 - WARNING: Output provided by method node2vec contains 129 columns, 128 expected! Taking first column as nodeID...
10-12-19 10:01:00 - FileNotFoundError: [Errno 2] No such file or directory: './emb.tmp'
Traceback (most recent call last):
File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne-0.3.1-py3.6.egg/evalne/evaluation/evaluator.py", line 457, in _evaluate_ne_cmd
X = pp.read_node_embeddings(tmpemb, data_split.TG.nodes, self.dim, output_delim, method_name)
File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne-0.3.1-py3.6.egg/evalne/utils/preprocess.py", line 174, in read_node_embeddings
emb_skiprows = infer_header(input_path, len(nodes), method_name)
File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne-0.3.1-py3.6.egg/evalne/utils/preprocess.py", line 123, in infer_header
num_lines = sum(1 for _ in open(input_path))
FileNotFoundError: [Errno 2] No such file or directory: './emb.tmp'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne-0.3.1-py3.6.egg/evalne/evaluation/evaluator.py", line 313, in evaluate_cmd
write_weights, write_dir, verbose)
File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne-0.3.1-py3.6.egg/evalne/evaluation/evaluator.py", line 468, in _evaluate_ne_cmd
'\nSetting verbose=True can provide more information.'.format(method_name))
OSError: Execution of method node2vec did not generate node embeddings file.
Possible reasons: 1) method is not correctly installed or 2) wrong method call or parameters...
Setting verbose=True can provide more information.
10-12-19 10:02:25 - WARNING: Output of method node2vec contains 1 more lines than expected. Will consider them part of the header and ignore them... Expected num_lines 4039, obtained lines 4040.
10-12-19 10:09:29 - WARNING: Output provided by method node2vec contains 129 columns, 128 expected! Taking first column as nodeID...
10-12-19 10:10:22 - ERROR: Exception occurred while evaluating param --p 2 --q 1 for method node2vec on Facebook.
Traceback (most recent call last):
File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne-0.3.1-py3.6.egg/evalne/evaluation/evaluator.py", line 457, in _evaluate_ne_cmd
X = pp.read_node_embeddings(tmpemb, data_split.TG.nodes, self.dim, output_delim, method_name)
File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne-0.3.1-py3.6.egg/evalne/utils/preprocess.py", line 174, in read_node_embeddings
emb_skiprows = infer_header(input_path, len(nodes), method_name)
File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne-0.3.1-py3.6.egg/evalne/utils/preprocess.py", line 123, in infer_header
num_lines = sum(1 for _ in open(input_path))
FileNotFoundError: [Errno 2] No such file or directory: './emb.tmp'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne-0.3.1-py3.6.egg/evalne/evaluation/evaluator.py", line 313, in evaluate_cmd
write_weights, write_dir, verbose)
File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne-0.3.1-py3.6.egg/evalne/evaluation/evaluator.py", line 468, in _evaluate_ne_cmd
'\nSetting verbose=True can provide more information.'.format(method_name))
OSError: Execution of method node2vec did not generate node embeddings file.
Possible reasons: 1) method is not correctly installed or 2) wrong method call or parameters...
Setting verbose=True can provide more information.
10-12-19 10:11:12 - WARNING: Output of method node2vec contains 1 more lines than expected. Will consider them part of the header and ignore them... Expected num_lines 4039, obtained lines 4040.
----------------------- many similar WARNING-----------------------------
10-12-19 10:22:33 - WARNING: Output provided by method line contains 129 columns, 128 expected! Taking first column as nodeID...
10-12-19 10:22:37 - INFO: ====== Evaluating PPI network ======
10-12-19 10:22:38 - INFO: ------ Repetition 0 of experiment ------
10-12-19 10:24:26 - WARNING: Output of method node2vec contains 188 more lines than expected. Will consider them part of the header and ignore them... Expected num_lines 3852, obtained lines 4040.
10-12-19 10:24:26 - WARNING: Output provided by method node2vec contains 129 columns, 128 expected! Taking first column as nodeID...
#######################Error message in terminal########################
python3 ./node2vec_python3/src/main.py --input ./edgelist.tmp --output ./emb.tmp --dimensions 128 --walk-length 80 --num-walks 10 --window-size 10 --workers 8 --p 1 --q 1 --p 0.25 --q 0.25
Walk iteration:
1 / 10
2 / 10
3 / 10
4 / 10
5 / 10
6 / 10
7 / 10
8 / 10
9 / 10
10 / 10
Traceback (most recent call last):
File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "evalne/__main__.py", line 329, in <module>
By the way
Thanks again for your time. Xikun
Hi,
I cannot replicate the error. Could you please try running the evaluation as shown below, and let me know if it solves the issue?
python3 -m evalne ./examples/node2vec/conf_node2vec.ini
I found that without the -m parameter some strange error might occur. Also, have you run multiple evaluations in the same directory? If so, that could explain the error you are getting. Or is it possible that the emb.tmp file generated by the library got somehow deleted?
Edit: Going over the log again I noticed a few things:
1) the error is not related to EvalNE, but to Node2vec
2) It seems that node2vec is not generating the correct embeddings file 'emb.tmp' and thus the library is not able to find it and read it
3) I noticed that you are running node2vec using python3. The original node2vec repo only mentions python2 so it is possible that the code is unstable when executed with python3 (especially the gensim library part). This might cause some executions to run successfully and others to fail. I would recommend creating a python2 virtualenv for node2vec and installing all the dependencies there, then in the conf file you would just need to specify something like:
python2 ./node2vec_python2/src/main.py .....
4) Also, a quick explanation of what is going on in the conf file:
The library will try to evaluate these methods:
NAMES_OTHER = node2vec deepWalk line
The command line calls corresponding to each method are in order:
METHODS_OTHER =
python ../../../methods/node2vec/main.py ...
python ../../../methods/node2vec/main.py ... --p 1 --q 1
../../../methods/LINE/linux/line ...
Note that deepwalk is node2vec with p=1 and q=1, so we specify them directly in the second line of the METHODS_OTHER variable.
For the first method listed in NAMES_OTHER, in this case node2vec, we will tune hyperparameters using grid search:
TUNE_PARAMS_OTHER = --p 0.25 0.5 1 2 4 --q 0.25 0.5 1 2 4
If there were a second line in TUNE_PARAMS_OTHER it would be assumed to contain the parameters you want to tune for deepwalk. If there were a third line in the variable it would be assumed to refer to LINE.
5) Keep in mind that it's perfectly fine to call some methods with python2 and others with python3 in the EvalNE conf files
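To make the grid search from point 4 concrete: a TUNE_PARAMS_OTHER line like the one above describes 5 x 5 = 25 combinations of p and q, each of which would be appended to the method's command line. The sketch below expands such a line into those combinations. It is an illustration only; `expand_tune_params` is a hypothetical helper, and the library's own parsing may differ.

```python
from itertools import product


def expand_tune_params(tune_line):
    """Expand '--p 0.25 0.5 --q 1 2' into ['--p 0.25 --q 1', '--p 0.25 --q 2', ...].

    Hypothetical helper for illustration; not EvalNE's parser.
    """
    flags = {}
    current = None
    for token in tune_line.split():
        if token.startswith("--"):
            current = token
            flags[current] = []
        elif current is None:
            raise ValueError("Value {!r} appears before any --flag".format(token))
        else:
            flags[current].append(token)
    names = list(flags)
    value_lists = [flags[name] for name in names]
    return [
        " ".join("{} {}".format(name, value) for name, value in zip(names, combo))
        for combo in product(*value_lists)
    ]


combos = expand_tune_params("--p 0.25 0.5 1 2 4 --q 0.25 0.5 1 2 4")
print(len(combos))   # number of (p, q) pairs in the grid
print(combos[0])     # first pair, appended to the method's command line
```

One of the resulting strings is `--p 2 --q 1`, the parameter setting named in the ERROR line of the log above.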
I've never tried to run the original node2vec code using py3, but I did have similar issues with other methods/libraries. Please let me know if running the method with py2 solves the issues.
Alex
Hi Alex, Your answer really helps!
Yes, I do run multiple evaluations in the same directory when testing "average" method. In directory EvalNE/, I run two evaluations at the same time.
python3 evalne ./examples/node2vec/conf_node2vec.ini
If I want to run two evaluations at the same time, I should run the command from different directories, right? For example:
In directory EvalNE/one_dir/
python3 -m evalne relative/path/to/conf_node2vec.ini
In directory EvalNE/another_dir/
python3 -m evalne relative/path/to/conf_node2vec.ini
Thanks. Xikun
Hello,
Yes, you should run the evaluations from different directories, as you mentioned. The library generates some temporary files which are used to communicate information to the methods it executes, so if several evaluations are run in the same folder, the different processes will start interfering with each other's temporary files. I will modify this behaviour in future versions of the library and make it possible to run many evaluations in the same folder.
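One way to avoid this kind of clash in general is to give each run its own working directory, so that relative paths such as ./emb.tmp never collide. The sketch below shows the idea with the standard library only. It is not how EvalNE works internally; `run_in_private_dir` is a hypothetical helper and the "method" is a stand-in command that just writes a small emb.tmp file.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_private_dir(command):
    """Run a command inside a fresh temporary directory and return its emb.tmp contents.

    Hypothetical helper for illustration; relative output paths used by the
    command resolve inside the private directory, so parallel runs cannot
    overwrite each other's files.
    """
    with tempfile.TemporaryDirectory() as workdir:
        subprocess.run(command, cwd=workdir, check=True)
        emb_file = Path(workdir) / "emb.tmp"
        return emb_file.read_text() if emb_file.exists() else None


# Stand-in for an embedding method: writes one fake embedding line to ./emb.tmp.
fake_method = [
    sys.executable,
    "-c",
    "open('emb.tmp', 'w').write('1 0.5 0.5\\n')",
]
result = run_in_private_dir(fake_method)
print(result)
```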
Alex
Hi,
Running the evaluation with the -m parameter and py2 node2vec solves the issues. Thanks for your help.
Xikun
Hi Xikun,
Great to hear! Since the original issue seems to be resolved I'll close it, but please let us know if you have any other issues or suggestions :)
Alex
Describe the bug
I have installed EvalNE, OpenNE library, PRUNE and Metapath2Vec following the instructions. When I run evaluator_example.py, I encounter several errors and warnings.
metapath2vec++ contains 2 more lines than expected. Will consider them part of the header and ignore them... Expected num_lines 703, obtained lines 705.
To Reproduce
Steps to reproduce the error:
Error 2
Traceback (most recent call last):
File "/home/huangxk/workspace_python/embedding/EvalNE/examples/evaluator_example.py", line 185, in <module>
main()
File "/home/huangxk/workspace_python/embedding/EvalNE/examples/evaluator_example.py", line 70, in main
scoresheet.write_tabular(filename=os.path.join(outpath, 'eval_output.txt'), metric='auroc')
File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne/evaluation/score.py", line 204, in write_tabular
df.to_csv(f, sep='\t', na_rep='NA')
File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/pandas/core/generic.py", line 3228, in to_csv
formatter.save()
File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/pandas/io/formats/csvs.py", line 202, in save
self._save()
File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/pandas/io/formats/csvs.py", line 310, in _save
self._save_header()
File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/pandas/io/formats/csvs.py", line 278, in _save_header
writer.writerow(encoded_labels)
TypeError: a bytes-like object is required, not 'str'
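This TypeError is the usual symptom of handing Python 3 text to a file opened in binary mode. The sketch below reproduces that mismatch with the standard csv module and shows the text-mode variant succeeding. It is a generic illustration, not the EvalNE source and not necessarily the change applied in this issue; the file name and column labels are placeholders.

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "eval_output.txt")

# Binary mode: on Python 3 the csv writer produces str, which a binary file rejects.
try:
    with open(path, "wb") as f:
        csv.writer(f, delimiter="\t").writerow(["method", "auroc"])
    binary_mode_error = None
except TypeError as exc:
    binary_mode_error = str(exc)

# Text mode (newline="" as the csv documentation recommends): the write succeeds.
with open(path, "w", newline="") as f:
    csv.writer(f, delimiter="\t").writerow(["method", "auroc"])

with open(path) as f:
    written = f.read()

print(binary_mode_error)
print(repr(written))
```

On Python 2 the situation is reversed (the csv module there expects files opened in binary mode), which is one reason a fix that works on py3 may need a second look for py2.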
Error 3
Preprocessing graph...
Repetition 0 of experiment
Evaluating baselines...
Evaluating Embedding methods...
ERROR:root:No test edges in trainvalid_split. Recomputing correct split...
Running command...
Warning 4
WARNING:root:Output of method metapath2vec++ contains 2 more lines than expected. Will consider them part of the header and ignore them... Expected num_lines 703, obtained lines 705.
WARNING:root:Output provided by method metapath2vec++ contains 129 columns, 128 expected! Taking first column as nodeID...
WARNING:root:Output of method node2vec contains 1 more lines than expected. Will consider them part of the header and ignore them... Expected num_lines 703, obtained lines 704.
WARNING:root:Output provided by method node2vec contains 129 columns, 128 expected! Taking first column as nodeID...
WARNING:root:Output of method deepwalk contains 1 more lines than expected. Will consider them part of the header and ignore them... Expected num_lines 703, obtained lines 704.
WARNING:root:Output provided by method deepwalk contains 129 columns, 128 expected! Taking first column as nodeID...
My solutions
I have tried to solve Error 1 and Error 2 and it works (but I am not sure whether it is the right solution). Although Error 3 and Warning 4 are WARNING, I want to know the reason and whether I should ignore them or not.
My solution to TypeError: 'Results' object is not iterable. In file evaluator_example.py, line 149-150, line 173-174 :
change them to
My solution to TypeError: a bytes-like object is required, not 'str' It seems that this error is caused by text/binary mode. This question in stackoverflow may be helpful. So I tried to change the source code of evalne: in file evalne/evaluation/score.py, line 201-202
change them to
ERROR:root:No test edges in trainvalid_split. Recomputing correct split... Why this error message? Should I ignore it?
Warning 4 It seems that they are related to OpenNE?
Result Even with Error 3 and Warning 4, I still get result file in example/output/eval_output.txt
Is this correct?
Thanks for sharing this great library. I am learning to use it. Best, Xikun