[BUG] 1. TypeError: 'Results' object is not iterable. 2. TypeError: a bytes-like object is required, not 'str'

XikunHuang commented 4 years ago

Describe the bug
I have installed EvalNE, the OpenNE library, PRUNE and Metapath2Vec following the instructions. When I run _evaluator_example.py_, I encounter several errors and warnings.

TypeError: 'Results' object is not iterable.
TypeError: a bytes-like object is required, not 'str'
ERROR:root:No test edges in trainvalid_split. Recomputing correct split...
WARNING:root:Output of method metapath2vec++ contains 2 more lines than expected. Will consider them part of the header and ignore them... Expected num_lines 703, obtained lines 705.

To Reproduce
Steps to reproduce the error:

OS used: Ubuntu 18.04.1 LTS
EvalNE Version: 0.3.1
Snippet of code executed (for API) or conf file run (for CLI)
```
cd examples/
python3 evaluator_example.py
```
Full error output
Error 1
Traceback (most recent call last):
  File "evaluator_example.py", line 185, in <module>
    main()
  File "evaluator_example.py", line 67, in main
    eval_other(nee, scoresheet)
  File "evaluator_example.py", line 153, in eval_other
    for res in results:
TypeError: 'Results' object is not iterable

Error 2
Traceback (most recent call last):
  File "/home/huangxk/workspace_python/embedding/EvalNE/examples/evaluator_example.py", line 185, in <module>
    main()
  File "/home/huangxk/workspace_python/embedding/EvalNE/examples/evaluator_example.py", line 70, in main
    scoresheet.write_tabular(filename=os.path.join(outpath, 'eval_output.txt'), metric='auroc')
  File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne/evaluation/score.py", line 204, in write_tabular
    df.to_csv(f, sep='\t', na_rep='NA')
  File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/pandas/core/generic.py", line 3228, in to_csv
    formatter.save()
  File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/pandas/io/formats/csvs.py", line 202, in save
    self._save()
  File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/pandas/io/formats/csvs.py", line 310, in _save
    self._save_header()
  File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/pandas/io/formats/csvs.py", line 278, in _save_header
    writer.writerow(encoded_labels)
TypeError: a bytes-like object is required, not 'str'

Error 3
Preprocessing graph...
Repetition 0 of experiment
Evaluating baselines...
Evaluating Embedding methods...
ERROR:root:No test edges in trainvalid_split. Recomputing correct split...
Running command...

Warning 4
WARNING:root:Output of method metapath2vec++ contains 2 more lines than expected. Will consider them part of the header and ignore them... Expected num_lines 703, obtained lines 705.
WARNING:root:Output provided by method metapath2vec++ contains 129 columns, 128 expected! Taking first column as nodeID...
WARNING:root:Output of method node2vec contains 1 more lines than expected. Will consider them part of the header and ignore them... Expected num_lines 703, obtained lines 704.
WARNING:root:Output provided by method node2vec contains 129 columns, 128 expected! Taking first column as nodeID...
WARNING:root:Output of method deepwalk contains 1 more lines than expected. Will consider them part of the header and ignore them... Expected num_lines 703, obtained lines 704.
WARNING:root:Output provided by method deepwalk contains 129 columns, 128 expected! Taking first column as nodeID...

My solutions
I have tried to solve Error 1 and Error 2, and my fixes work (but I am not sure whether they are the right solutions). Although Error 3 and Warning 4 are only warnings, I would like to know what causes them and whether I can ignore them.

My solution to TypeError: 'Results' object is not iterable.
In file evaluator_example.py, lines 149-150 and lines 173-174:
```
for res in results:
  scoresheet.log_results(res)
```
change them to
```
scoresheet.log_results(results)
```
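For scripts that have to run against both the older behaviour (a list of results) and the newer one (a single `Results` object), a defensive wrapper is one option. The sketch below is only illustrative: `FakeScoresheet` and `log_any` are hypothetical names made up for this example, and nothing here is taken from EvalNE itself.

```python
class FakeScoresheet:
    """Hypothetical stand-in that only records what it is asked to log."""

    def __init__(self):
        self.logged = []

    def log_results(self, res):
        self.logged.append(res)


def log_any(scoresheet, results):
    """Log either a single result object or a list/tuple of them."""
    if isinstance(results, (list, tuple)):
        for res in results:
            scoresheet.log_results(res)
    else:
        # A single, non-iterable object: pass it through unchanged.
        scoresheet.log_results(results)


sheet = FakeScoresheet()
log_any(sheet, object())              # single result
log_any(sheet, [object(), object()])  # list of results
print(len(sheet.logged))
```

Checking for `list`/`tuple` explicitly, rather than trying to iterate, avoids accidentally looping over an object that happens to be iterable for unrelated reasons.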
My solution to TypeError: a bytes-like object is required, not 'str'
This error seems to be caused by opening the file in binary mode and then writing text to it. This question on Stack Overflow may be helpful. So I changed the source code of evalne, in file evalne/evaluation/score.py, lines 201-202:
```
f = open(filename, 'a+b')
f.write(header.encode())
```
change them to
```
f = open(filename, 'a')
f.write(header)
```
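The text/binary mismatch can be reproduced without pandas or EvalNE. The traceback shows the failure inside a `writer.writerow(...)` call, so the sketch below uses only the standard-library `csv` module on Python 3 as a stand-in; the file name and row contents are made up for illustration.

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "out.txt")

# Binary append mode: csv.writer passes str data to a file that expects bytes.
try:
    with open(path, "a+b") as f:
        csv.writer(f, delimiter="\t").writerow(["method", "auroc"])
    binary_error = None
except TypeError as exc:
    binary_error = str(exc)

# Text append mode: the same call succeeds.
with open(path, "a", newline="") as f:
    csv.writer(f, delimiter="\t").writerow(["method", "auroc"])

with open(path) as f:
    content = f.read()

print(binary_error)
print(repr(content))
```

On Python 3 the first attempt fails with a `TypeError`, while the second one writes the row as expected.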
Error 3 (ERROR:root:No test edges in trainvalid_split. Recomputing correct split...): Why does this error message appear? Should I ignore it?
Warning 4: These warnings seem to be related to OpenNE. Is that right?

Result
Even with Error 3 and Warning 4, I still get a result file in example/output/eval_output.txt

Evaluation results (auroc):
-----------------------
    network
random_prediction   0.4942
common_neighbours   0.8458
jaccard_coefficient 0.7255
adamic_adar_index   0.8551
preferential_attachment 0.9376
resource_allocation_index   0.853
PRUNE   0.8299
metapath2vec++  0.8218
node2vec    0.8796
deepwalk    0.8603
line    0.8997

Is this correct?

Desktop (please complete the following information):

OS: Ubuntu 18.04.1 LTS
EvalNE Version : 0.3.1
Python: 3.6.7

Thanks for sharing this great library. I am learning to use it. Best, Xikun

Dru-Mara commented 4 years ago

Hi Xikun,

First of all, thank you for the detailed bug report; it's very easy to follow. Regarding the errors/warnings you are encountering:

Error 1: This is indeed a bug in the example. We made some changes to the library and forgot to update this example accordingly, sorry about that. I'll push a fix for it in a few minutes. Your solution of removing that for loop is indeed correct :)

Error 2: This is also a bug and your solution should work fine for py3, however, I'm not entirely sure it will on py2. I'll need a bit more time to look into it.
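One possible way to handle both interpreters is to pick the file mode from the Python major version, since the `csv` machinery expects binary files on py2 and text files on py3. This is only a sketch under that assumption, not the fix that went into the library; `open_for_csv` is a hypothetical helper name.

```python
import csv
import os
import sys
import tempfile


def open_for_csv(path):
    """Open path for appending csv rows, choosing the mode by Python major version."""
    if sys.version_info[0] >= 3:
        # Python 3: csv expects a text-mode file; newline="" lets csv control line endings.
        return open(path, "a", newline="")
    # Python 2: csv expects a binary-mode file.
    return open(path, "ab")


path = os.path.join(tempfile.mkdtemp(), "eval_output.txt")
with open_for_csv(path) as f:
    f.write("Evaluation results (auroc):\n")
    csv.writer(f, delimiter="\t").writerow(["node2vec", "0.8796"])

with open(path) as f:
    lines = f.read().splitlines()

print(lines)
```

The header is written as a plain `str`, which works in both cases because a py2 `str` is already a byte string.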

Error 3: Although it says Error, this should simply be a warning. It basically tells you that you have not selected a train/validation split, so the library will compute one for you. This train/validation split will use a fixed 90/10 ratio and otherwise the same parameters as your train/test split. The train/validation split is necessary in order to tune the hyperparameters of node2vec (as you can see, tune_params is set to tune p and q). In the next update, I will include an explicit train/validation edge split so the "error" disappears. I'll also make it a warning.
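To make the 90/10 idea concrete, here is a deliberately naive sketch that shuffles the training edges and cuts them at 90%. It is not the library's splitting algorithm, which is not shown in this thread and may differ (for example, in how it treats connectivity or non-edges); `naive_train_valid_split` and its parameters are hypothetical.

```python
import random


def naive_train_valid_split(train_edges, train_frac=0.9, seed=0):
    """Shuffle the edges and cut them into train and validation parts."""
    edges = list(train_edges)
    random.Random(seed).shuffle(edges)  # seeded for reproducibility
    cut = int(round(train_frac * len(edges)))
    return edges[:cut], edges[cut:]


edges = [(i, i + 1) for i in range(100)]
train, valid = naive_train_valid_split(edges)
print(len(train), len(valid))
```

The validation part is then used only to compare hyperparameter settings, while the held-out test edges stay untouched until the final evaluation.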

Warning 4: The warnings (which should be made a bit more clear) basically tell you the following:

WARNING:root:Output of method metapath2vec++ contains 2 more lines than expected. Will consider them part of the header and ignore them... Expected num_lines 703, obtained lines 705.

In this case, metapath2vec returned an embedding file with 705 lines. Of those, 703 lines were identified as embedding vectors corresponding to graph nodes (your graph contained 703 nodes after preprocessing). The remaining two lines in the file were considered to be header lines and thus ignored by EvalNE. Most methods return header lines in the output files, so you will see that warning a lot. Finally, metapath2vec does indeed return two header lines.
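The arithmetic behind that warning can be sketched in a few lines. This is an illustration of the idea as described here, not the library's code; `count_header_lines` is a hypothetical helper and the sample lines are made up.

```python
def count_header_lines(lines, num_nodes):
    """Treat any lines beyond the expected node count as leading header lines."""
    extra = len(lines) - num_nodes
    if extra < 0:
        raise ValueError("file has fewer lines than there are nodes")
    return extra


# 2 header lines followed by one embedding line per node (703 nodes).
lines = ["header 1", "header 2"] + ["v{} 0.1 0.2".format(i) for i in range(703)]
skip = count_header_lines(lines, num_nodes=703)
embedding_lines = lines[skip:]
print(skip, len(embedding_lines))
```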

WARNING:root:Output provided by method metapath2vec++ contains 129 columns, 128 expected! Taking first column as nodeID...

This warning tells you that the output embedding file of metapath2vec contained one more column than expected. In this case, you asked for 128-dimensional embeddings but EvalNE found 129 columns in the file. The library will automatically take the first of those columns as the nodeID and the remaining ones as the actual embeddings of nodes. The reason for having this warning is that NE methods behave in one of two ways. They either return the embeddings as:

NodeID0, x00, x01, ... x0d
NodeID1, x10, x11, ... x1d
...
NodeIDn, xn0, xn1, ... xnd

or

x00, x01, ... x0d
x10, x11, ... x1d
...
xn0, xn1, ... xnd

The warning basically tells you that metapath2vec is a method that returns the data as in the first example and not as in the second one.
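The column check described above could look roughly like this. It is a sketch of the idea only: `split_ids` is a hypothetical helper, and falling back to the row index as node ID in the second layout is an assumption made for the example.

```python
def split_ids(rows, dim):
    """Return (node_ids, vectors) for rows that have dim or dim + 1 columns."""
    ncols = len(rows[0])
    if ncols == dim + 1:
        # First layout: the first column is the node ID.
        node_ids = [row[0] for row in rows]
        vectors = [[float(x) for x in row[1:]] for row in rows]
    elif ncols == dim:
        # Second layout: no IDs in the file; assume row order matches node order.
        node_ids = [str(i) for i in range(len(rows))]
        vectors = [[float(x) for x in row] for row in rows]
    else:
        raise ValueError("expected {} or {} columns, got {}".format(dim, dim + 1, ncols))
    return node_ids, vectors


with_ids = [["7", "0.1", "0.2", "0.3"], ["9", "0.4", "0.5", "0.6"]]
without_ids = [["0.1", "0.2", "0.3"], ["0.4", "0.5", "0.6"]]
ids_a, vecs_a = split_ids(with_ids, dim=3)
ids_b, vecs_b = split_ids(without_ids, dim=3)
print(ids_a, ids_b)
```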

Finally, the results you are getting seem correct to me. I hope this helps, and thanks again for pointing out those bugs!

Alex

XikunHuang commented 4 years ago

Hi, Alex Thanks for your quick and detailed reply. It really helps!

I encountered a new error when running examples/node2vec/conf_node2vec.ini

Describe the bug
FileNotFoundError: [Errno 2] No such file or directory: './emb.tmp'
OSError: Execution of method node2vec did not generate node embeddings file. Possible reasons: 1) method is not correctly installed or 2) wrong method call or parameters...

To Reproduce

OS used: Ubuntu 18.04
EvalNE Version : 0.3.1
Snippet of code executed (for API) or conf file run (for CLI)
In file examples/node2vec/conf_node2vec.ini
- I have set the correct dataset paths and method paths
- I replaced EDGE_EMBEDDING_METHODS = average with EDGE_EMBEDDING_METHODS = hadamard

Then run:
```
python3 evalne ./examples/node2vec/conf_node2vec.ini 
```
Full error output
This error is weird, because "Repetition 0 of experiment" and "Repetition 1 of experiment" run fine. The error occurs during "Repetition 2 of experiment".

#################### Error message in file eval.log#############################

10-12-19 09:51:20 - INFO: ------ Repetition 2 of experiment ------
10-12-19 09:53:21 - WARNING: Output of method node2vec contains 1 more lines than expected. Will consider them part of the header and ignore them... Expected num_lines 4039, obtained lines 4040.
----------------------- many similar WARNING-----------------------------
10-12-19 10:00:55 - WARNING: Output provided by method node2vec contains 129 columns, 128 expected! Taking first column as nodeID...
10-12-19 10:01:00 - FileNotFoundError: [Errno 2] No such file or directory: './emb.tmp'
Traceback (most recent call last):
  File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne-0.3.1-py3.6.egg/evalne/evaluation/evaluator.py", line 457, in _evaluate_ne_cmd
    X = pp.read_node_embeddings(tmpemb, data_split.TG.nodes, self.dim, output_delim, method_name)
  File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne-0.3.1-py3.6.egg/evalne/utils/preprocess.py", line 174, in read_node_embeddings
    emb_skiprows = infer_header(input_path, len(nodes), method_name)
  File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne-0.3.1-py3.6.egg/evalne/utils/preprocess.py", line 123, in infer_header
    num_lines = sum(1 for _ in open(input_path))
FileNotFoundError: [Errno 2] No such file or directory: './emb.tmp'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne-0.3.1-py3.6.egg/evalne/evaluation/evaluator.py", line 313, in evaluate_cmd
    write_weights, write_dir, verbose)
  File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne-0.3.1-py3.6.egg/evalne/evaluation/evaluator.py", line 468, in _evaluate_ne_cmd
    '\nSetting verbose=True can provide more information.'.format(method_name))
OSError: Execution of method node2vec did not generate node embeddings file. Possible reasons: 1) method is not correctly installed or 2) wrong method call or parameters...
Setting verbose=True can provide more information.
10-12-19 10:02:25 - WARNING: Output of method node2vec contains 1 more lines than expected. Will consider them part of the header and ignore them... Expected num_lines 4039, obtained lines 4040.

10-12-19 10:09:29 - WARNING: Output provided by method node2vec contains 129 columns, 128 expected! Taking first column as nodeID...
10-12-19 10:10:22 - ERROR: Exception occurred while evaluating param --p 2 --q 1 for method node2vec on Facebook.
Traceback (most recent call last):
  File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne-0.3.1-py3.6.egg/evalne/evaluation/evaluator.py", line 457, in _evaluate_ne_cmd
    X = pp.read_node_embeddings(tmpemb, data_split.TG.nodes, self.dim, output_delim, method_name)
  File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne-0.3.1-py3.6.egg/evalne/utils/preprocess.py", line 174, in read_node_embeddings
    emb_skiprows = infer_header(input_path, len(nodes), method_name)
  File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne-0.3.1-py3.6.egg/evalne/utils/preprocess.py", line 123, in infer_header
    num_lines = sum(1 for _ in open(input_path))
FileNotFoundError: [Errno 2] No such file or directory: './emb.tmp'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne-0.3.1-py3.6.egg/evalne/evaluation/evaluator.py", line 313, in evaluate_cmd
    write_weights, write_dir, verbose)
  File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne-0.3.1-py3.6.egg/evalne/evaluation/evaluator.py", line 468, in _evaluate_ne_cmd
    '\nSetting verbose=True can provide more information.'.format(method_name))
OSError: Execution of method node2vec did not generate node embeddings file. Possible reasons: 1) method is not correctly installed or 2) wrong method call or parameters...
Setting verbose=True can provide more information.
10-12-19 10:11:12 - WARNING: Output of method node2vec contains 1 more lines than expected. Will consider them part of the header and ignore them... Expected num_lines 4039, obtained lines 4040.
----------------------- many similar WARNING-----------------------------
10-12-19 10:22:33 - WARNING: Output provided by method line contains 129 columns, 128 expected! Taking first column as nodeID...
10-12-19 10:22:37 - INFO: ====== Evaluating PPI network ======
10-12-19 10:22:38 - INFO: ------ Repetition 0 of experiment ------
10-12-19 10:24:26 - WARNING: Output of method node2vec contains 188 more lines than expected. Will consider them part of the header and ignore them... Expected num_lines 3852, obtained lines 4040.
10-12-19 10:24:26 - WARNING: Output provided by method node2vec contains 129 columns, 128 expected! Taking first column as nodeID...

#######################Error message in terminal########################

python3 ./node2vec_python3/src/main.py --input ./edgelist.tmp --output ./emb.tmp --dimensions 128 --walk-length 80 --num-walks 10 --window-size 10 --workers 8 --p 1 --q 1 --p 0.25 --q 0.25
Walk iteration:
1 / 10
2 / 10
3 / 10
4 / 10
5 / 10
6 / 10
7 / 10
8 / 10
9 / 10
10 / 10
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "evalne/__main__.py", line 329, in <module>
    main()
  File "evalne/__main__.py", line 42, in main
    evaluate(setup)
  File "evalne/__main__.py", line 148, in evaluate
    lp_coef = eval_other(setup, nee, i, scoresheet, repeat, nw_outpath)
  File "evalne/__main__.py", line 266, in eval_other
    write_dir=setup.write_dir_other[j], verbose=setup.verbose)
  File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne-0.3.1-py3.6.egg/evalne/evaluation/evaluator.py", line 313, in evaluate_cmd
    write_weights, write_dir, verbose)
  File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne-0.3.1-py3.6.egg/evalne/evaluation/evaluator.py", line 462, in _evaluate_ne_cmd
    results.append(self.evaluate_ne(data_split=data_split, X=X, method=method_name, edge_embed_method=ee))
  File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne-0.3.1-py3.6.egg/evalne/evaluation/evaluator.py", line 641, in evaluate_ne
    tr_edge_embeds, te_edge_embeds = self.compute_ee(data_split, X, edge_embed_method)
  File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne-0.3.1-py3.6.egg/evalne/evaluation/evaluator.py", line 677, in compute_ee
    tr_edge_embeds = func(X, data_split.train_edges)
  File "/home/huangxk/workspace_python/embedding/EvalNE/venv_for_evlne/lib/python3.6/site-packages/evalne-0.3.1-py3.6.egg/evalne/evaluation/edge_embeddings.py", line 59, in hadamard
    edge_embeds[i] = X[str(edge[0])] * X[str(edge[1])]
KeyError: '1413'
Progress on lp task: 33%|███████████████████████████████████████████████▎ | 3/9 [1:17:39<2:35:18, 1553.13s/it]

By the way

Before testing the "hadamard" method, I tested the "average" method twice using conf_node2vec.ini. Strangely, I encountered the above errors the first time and succeeded the second time (without changing anything).
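The terminal traceback above ends with `KeyError: '1413'` on the line `X[str(edge[0])] * X[str(edge[1])]`, which is what a dictionary lookup produces when the embeddings that were read do not contain a node that appears in the edge list. A pure-Python sketch of a Hadamard edge embedding that reports all missing nodes up front might look like this; it is not the library's implementation, and the toy embeddings are made up.

```python
def hadamard(X, edges):
    """Element-wise product of the two node vectors of every edge.

    X maps node IDs (as strings) to equal-length lists of floats.
    """
    missing = sorted({str(n) for edge in edges for n in edge if str(n) not in X})
    if missing:
        raise KeyError("no embedding for node(s): " + ", ".join(missing))
    return [[a * b for a, b in zip(X[str(u)], X[str(v)])] for u, v in edges]


X = {"0": [1.0, 2.0], "1": [3.0, 4.0]}
print(hadamard(X, [(0, 1)]))

try:
    hadamard(X, [(0, 1413)])
except KeyError as exc:
    error_message = str(exc)
    print(error_message)
```

A check like this does not fix the underlying mismatch between the embedding file and the graph, but it makes the cause visible before any edge embedding is computed.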

Thanks again for your time. Xikun

Dru-Mara commented 4 years ago

Hi,

I cannot replicate the error. Could you please try running the evaluation as shown below and let me know if it solves the issue?
python3 -m evalne ./examples/node2vec/conf_node2vec.ini

I found that without the -m parameter some strange errors can occur. Also, have you run multiple evaluations in the same directory? If so, that could explain the error you are getting. Or is it possible that the emb.tmp file generated by the library somehow got deleted?

Edit: Going over the log again I noticed a few things:

1) The error is not related to EvalNE, but to node2vec.
2) It seems that node2vec is not generating the correct embeddings file 'emb.tmp', and thus the library is not able to find it and read it.
3) I noticed that you are running node2vec using python3. The original node2vec repo only mentions python2, so it is possible that the code is unstable when executed with python3 (especially the gensim library part). This might cause some executions to run successfully and others to fail. I would recommend creating a python2 virtualenv for node2vec and installing all the dependencies there; then in the conf file you would just need to specify something like: python2 ./node2vec_python2/src/main.py .....
4) Also, a quick explanation of what is going on in the conf file. The library will try to evaluate these methods:
NAMES_OTHER = node2vec deepWalk line
The command line calls corresponding to each method are, in order:
METHODS_OTHER = python ../../../methods/node2vec/main.py ...
python ../../../methods/node2vec/main.py ... --p 1 --q 1
../../../methods/LINE/linux/line ...
Note that deepwalk is node2vec with p=1 and q=1, so we specify them directly in the second line of the METHODS_OTHER variable. For the first method listed in NAMES_OTHER, in this case node2vec, we will tune hyperparameters using grid search:
TUNE_PARAMS_OTHER = --p 0.25 0.5 1 2 4 --q 0.25 0.5 1 2 4
If there were a second line in TUNE_PARAMS_OTHER, it would be assumed to contain the parameters you want to tune for deepwalk. If there were a third line in the variable, it would be assumed to refer to LINE.
5) Keep in mind that it's perfectly fine to call some methods with python2 and others with python3 in the EvalNE conf files.
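To illustrate what a grid search over the tuning line amounts to, the sketch below expands `--p 0.25 0.5 1 2 4 --q 0.25 0.5 1 2 4` into one argument string per combination. It only mirrors the description above and is not the library's parser; `expand_tune_params` is a hypothetical helper.

```python
from itertools import product


def expand_tune_params(spec):
    """Turn '--p 0.25 0.5 --q 1 2' into one argument string per combination."""
    names, values = [], []
    for token in spec.split():
        if token.startswith("--"):
            names.append(token)
            values.append([])
        elif values:
            values[-1].append(token)
        else:
            raise ValueError("value {!r} appears before any --flag".format(token))
    return [
        " ".join("{} {}".format(name, val) for name, val in zip(names, combo))
        for combo in product(*values)
    ]


combos = expand_tune_params("--p 0.25 0.5 1 2 4 --q 0.25 0.5 1 2 4")
print(len(combos))
print(combos[0])
```

Five values for each of the two parameters give 25 combinations. Judging from the command printed in the terminal log earlier in the thread (`... --p 1 --q 1 --p 0.25 --q 0.25`), each combination appears to be appended to the method's base command.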

I've never tried to run the original node2vec code using py3, but I did have similar issues with other methods/libraries. Please, let me know if running the method with py2 solves the issues. Alex

XikunHuang commented 4 years ago

Hi Alex, Your answer really helps!

I will try running the evaluation with -m parameter and tell you the result (it takes some time).
Yes, I do run multiple evaluations in the same directory when testing "average" method. In directory EvalNE/, I run two evaluations at the same time.
```
python3 evalne ./examples/node2vec/conf_node2vec.ini 
```
If I want to run two evaluations at the same time, I should run the command in different directories, right? i.e.
```
# In directory EvalNE/one_dir/
python3 -m evalne relative/path/to/conf_node2vec.ini

# In directory EvalNE/another_dir/
python3 -m evalne relative/path/to/conf_node2vec.ini
```
Yes, I am running node2vec using python3. I have modified the original node2vec so that it works with py3, following this comment (https://github.com/aditya-grover/node2vec/issues/35#issuecomment-382930579).

"Keep in mind that it's perfectly fine to call some methods with python2 and others with python3 in the EvalNE conf files"

This really helps! I will try to run the original node2vec using py2 and tell you the result later.

Thanks. Xikun

Dru-Mara commented 4 years ago

Hello,

Yes, you should run the evaluations from different directories, as you mentioned. The library generates some temporary files which are used to pass information to the methods it executes, so if several evaluations are run in the same folder, the different processes will start overwriting each other's temporary files. I will modify this behaviour in future versions of the library and make it possible to run many evaluations in the same folder.
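Until then, one generic way to keep concurrent runs apart is to give each run its own scratch directory. The sketch below uses the standard-library `tempfile` module; `run_in_private_dir` and `fake_method` are hypothetical names, and the snippet says nothing about how the library itself will eventually handle this.

```python
import os
import tempfile


def run_in_private_dir(task):
    """Run task(workdir) inside a fresh temporary directory and return its result."""
    with tempfile.TemporaryDirectory(prefix="evalne_run_") as workdir:
        return task(workdir)


def fake_method(workdir):
    """Stand-in for an embedding method that writes emb.tmp into workdir."""
    emb_path = os.path.join(workdir, "emb.tmp")
    with open(emb_path, "w") as f:
        f.write("0 0.1 0.2\n")
    return emb_path


first = run_in_private_dir(fake_method)
second = run_in_private_dir(fake_method)
print(first != second)
```

Because every run gets a differently named directory, two processes can each write their own `emb.tmp` without clobbering the other's file, and the directory is removed when the run finishes.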

Alex

XikunHuang commented 4 years ago

Hi,

Running the evaluation with the -m parameter and py2 node2vec solves the issues. Thanks for your help.

Xikun

Dru-Mara commented 4 years ago

Hi Xikun,

Great to hear! Since the original issue seems to be resolved I'll close it, but please let us know if you have any other issues or suggestions :)

Alex

Dru-Mara / EvalNE

[BUG] 1. TypeError: 'Results' object is not iterable. 2. TypeError: a bytes-like object is required, not 'str' #8