FenTechSolutions / CausalDiscoveryToolbox

Package for causal inference in graphs and in the pairwise settings. Tools for graph structure recovery and dependencies are included.
https://fentechsolutions.github.io/CausalDiscoveryToolbox/html/index.html
MIT License

Trying to run GES #82

Open ManishaSharma-1 opened 3 years ago

ManishaSharma-1 commented 3 years ago

Hi Team, I am getting this error:

R Python Error Output

[Errno 2] File /tmp/cdt_ges_2755b76f-a9cc-40ad-b058-1b1c01cc5c21/result.csv does not exist: '/tmp/cdt_ges_2755b76f-a9cc-40ad-b058-1b1c01cc5c21/result.csv'

I am trying to run the snippet below:

import networkx as nx
import matplotlib.pyplot as plt
from cdt.causality.graph import GES
from cdt.data import load_dataset

data, graph = load_dataset("sachs")
obj = GES()

# The predict() method works without a graph, or with a
# directed or undirected graph provided as an input
output = obj.predict(data)  # No graph provided as an argument
output = obj.predict(data, nx.Graph(graph))  # With an undirected graph
output = obj.predict(data, graph)  # With a directed graph

# To view the graph created, run the below commands:
nx.draw_networkx(output, font_size=8)
plt.show()

diviyank commented 3 years ago

Hello! Sorry for the delay.

Which version of the code are you running? Could you update to the latest version? It includes a more detailed traceback.

Best, Diviyan
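For readers answering the version question, here is a minimal sketch of one way to look up the installed version. It relies only on the standard library (`importlib.metadata`, available from Python 3.8); the helper name `installed_version` is made up for illustration and is not part of cdt.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "cdt" is the package name as it appears in the pip list later in this thread.
    print("cdt version:", installed_version("cdt"))
```

Returning `None` instead of raising keeps the check usable in environments where the package is missing altogether.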

dcuoliveira commented 3 years ago

I had a similar issue today. The error is as follows

R Python Error Output 
-----------------------

[Errno 2] No such file or directory: '/var/folders/5b/ctj6f2bj6z55_qhnvbh2xgy80000gn/T/cdt_bnlearn_f63747a9-c5d2-4560-a821-8813a38ee9ab/result.csv'
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-39-908d0536cbd1> in <module>
      1 obj = IAMB()
----> 2 output = obj.predict(data)

/opt/anaconda3/lib/python3.8/site-packages/cdt/causality/graph/model.py in predict(self, df_data, graph, **kwargs)
     61         """
     62         if graph is None:
---> 63             return self.create_graph_from_data(df_data, **kwargs)
     64         elif isinstance(graph, nx.DiGraph):
     65             return self.orient_directed_graph(df_data, graph, **kwargs)

/opt/anaconda3/lib/python3.8/site-packages/cdt/causality/graph/bnlearn.py in create_graph_from_data(self, data)
    225         data2 = data.copy()
    226         data2.columns = [i for i in range(data.shape[1])]
--> 227         results = self._run_bnlearn(data2, verbose=self.verbose)
    228         graph = nx.DiGraph()
    229         graph.add_nodes_from(['X' + str(i) for i in range(data.shape[1])])

/opt/anaconda3/lib/python3.8/site-packages/cdt/causality/graph/bnlearn.py in _run_bnlearn(self, data, whitelist, blacklist, verbose)
    262         except Exception as e:
    263             rmtree(run_dir)
--> 264             raise e
    265         except KeyboardInterrupt:
    266             rmtree(run_dir)

/opt/anaconda3/lib/python3.8/site-packages/cdt/causality/graph/bnlearn.py in _run_bnlearn(self, data, whitelist, blacklist, verbose)
    257                 self.arguments['{E_WHITEL}'] = 'FALSE'
    258 
--> 259             bnlearn_result = launch_R_script(Path("{}/R_templates/bnlearn.R".format(os.path.dirname(os.path.realpath(__file__)))),
    260                                              self.arguments, output_function=retrieve_result, verbose=verbose)
    261         # Cleanup

/opt/anaconda3/lib/python3.8/site-packages/cdt/utils/R.py in launch_R_script(template, arguments, output_function, verbose, debug)
    219                 print("\nR Python Error Output \n-----------------------\n")
    220                 print(e)
--> 221                 raise RuntimeError("RProcessError \nR Process Error Output \n-----------------------\n" + str(err, "ISO-8859-1")) from None
    222             print("\nR Python Error Output \n-----------------------\n")
    223             print(e)

RuntimeError: RProcessError 
R Process Error Output 
-----------------------

I tried to set the Rscript path as follows:

import cdt
cdt.SETTINGS.rpath = "/Volumes/disk0s2/Library/Frameworks/R.framework/Resources/bin/R"
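Note that the path above ends in `R`, not `Rscript`. The sketch below shows one way to sanity-check a candidate path before assigning it. It rests on an assumption that is not confirmed anywhere in this thread: that cdt launches its R templates through the `Rscript` executable and that `cdt.SETTINGS.rpath` should point to it. The helper names `looks_like_rscript` and `find_rscript` are hypothetical.

```python
import shutil
from pathlib import PurePath


def looks_like_rscript(path):
    """True if the executable name is Rscript (or Rscript.exe on Windows)."""
    return PurePath(path).name.lower() in {"rscript", "rscript.exe"}


def find_rscript():
    """Return the full path of an Rscript executable found on PATH, or None."""
    return shutil.which("Rscript")


if __name__ == "__main__":
    candidate = "/Volumes/disk0s2/Library/Frameworks/R.framework/Resources/bin/R"
    print(looks_like_rscript(candidate))  # False: the file name is "R"
    print(find_rscript())  # depends on the local installation
```

If `find_rscript()` returns a path, assigning that value to `cdt.SETTINGS.rpath` would be the natural next thing to try under the assumption stated above.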

I had this issue running the following code:

from cdt.causality.graph import IAMB
from cdt.data import load_dataset

data, graph = load_dataset("sachs")
obj = IAMB()
output = obj.predict(data)

I'm running the code in VS Code on a MacBook with the following configuration:

macOS: Big Sur 11.2
Python: 3.8.5
R: 4.0.3 
cdt: 0.5.23

maxwellreynolds commented 2 years ago

I am also running into this issue with GES from the tutorial. Running version 0.5.23 on Mac

aguamentiPatronum commented 2 years ago

Same issue, with both GES and CCDr.

For CCDr:

obj = CCDr(verbose=self.verbose)
logger.info("About to predict")
output = obj.predict(df)

2022-02-15 08:05:20,115 - INFO - About to predict

R Python Error Output

[Errno 2] No such file or directory: '/var/folders/n5/hq13296s7pnbqsrhp_xw40xhmp2w4c/T/cdt_ccdr_2aa6880b-d173-4dc1-ade7-85fd63e57dac/result.csv'
Traceback (most recent call last):
  File "runCDT.py", line 131, in <module>
    worker.runCDT(args.inputParquet)
  File "runCDT.py", line 55, in runCDT
    output = obj.predict(df)
  File ".../anaconda3/envs/cdt/lib/python3.7/site-packages/cdt/causality/graph/model.py", line 63, in predict
    return self.create_graph_from_data(df_data, **kwargs)
  File ".../anaconda3/envs/cdt/lib/python3.7/site-packages/cdt/causality/graph/CCDr.py", line 121, in create_graph_from_data
    results = self._run_ccdr(data, verbose=self.verbose)
  File ".../anaconda3/envs/cdt/lib/python3.7/site-packages/cdt/causality/graph/CCDr.py", line 142, in _run_ccdr
    raise e
  File ".../anaconda3/envs/cdt/lib/python3.7/site-packages/cdt/causality/graph/CCDr.py", line 138, in _run_ccdr
    self.arguments, output_function=retrieve_result, verbose=verbose)
  File ".../anaconda3/envs/cdt/lib/python3.7/site-packages/cdt/utils/R.py", line 221, in launch_R_script
    raise RuntimeError("RProcessError \nR Process Error Output \n-----------------------\n" + str(err, "ISO-8859-1")) from None
RuntimeError: RProcessError
R Process Error Output

Loading required package: sparsebnUtils
Loading required package: ccdrAlgorithm
Loading required package: discretecdAlgorithm

sparsebn v0.1, Copyright (c) 2016-2020
    Bryon Aragam, University of Chicago
    Jiaying Gu, University of California, Los Angeles
    Dacheng Zhang, University of California, Los Angeles
    Qing Zhou, University of California, Los Angeles
    Fei Fu

Please cite our work! Type citation("sparsebn") for details.
---> Bugs? Please report any bugs at https://github.com/itsrainingdata/sparsebn/issues.

Attaching package: ‘MASS’

The following object is masked from ‘package:sparsebnUtils’:

    select

A list of interventions was not specified: Assuming data is purely observational.
Warning message:
In stats::cor(data) : the standard deviation is zero
Warning messages:
1: In max(dr, na.rm = TRUE) : no non-missing arguments to max; returning -Inf
2: In max(which(dr >= threshold)) : no non-missing arguments to max; returning -Inf
Error in estDAG[[lambda]] : invalid negative subscript in get1index
Calls: write.matrix -> as.matrix -> get.adjacency.matrix
Execution halted
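The R warning "the standard deviation is zero" from `stats::cor(data)` suggests that at least one column of the input never varies. As a minimal sketch, assuming the input is a pandas DataFrame, such columns can be listed before calling `predict`. The helper name `constant_columns` and the toy data are made up for illustration, and whether dropping these columns resolves this particular CCDr failure is not confirmed in the thread.

```python
import pandas as pd


def constant_columns(df):
    """Return names of columns with at most one distinct non-missing value."""
    return [col for col in df.columns if df[col].nunique(dropna=True) <= 1]


if __name__ == "__main__":
    # Toy data, not taken from the thread: column "b" never varies.
    toy = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [5.0, 5.0, 5.0]})
    print(constant_columns(toy))
    cleaned = toy.drop(columns=constant_columns(toy))
    print(list(cleaned.columns))
```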

And when called from within Jupyter Notebook:

R Python Error Output

[Errno 2] No such file or directory: '/var/folders/n5/hq13296s7pnbqsrhp_xw40xhmp2w4c/T/cdt_ccdr_981181e8-2fc1-4691-8206-731f14449426/result.csv'

RuntimeError Traceback (most recent call last)

in <module>
----> 1 G = graphr.runCDT(loc)

~/.../runCDT.py in runCDT(self, inputParquet)
     53 obj = CCDr(verbose=self.verbose)
     54 logger.info("About to predict")
---> 55 output = obj.predict(df)
     56 logger.info("Done predicting")
     57

~/opt/anaconda3/envs/cdt/lib/python3.7/site-packages/cdt/causality/graph/model.py in predict(self, df_data, graph, **kwargs)
     61 """
     62 if graph is None:
---> 63 return self.create_graph_from_data(df_data, **kwargs)
     64 elif isinstance(graph, nx.DiGraph):
     65 return self.orient_directed_graph(df_data, graph, **kwargs)

~/opt/anaconda3/envs/cdt/lib/python3.7/site-packages/cdt/causality/graph/CCDr.py in create_graph_from_data(self, data, **kwargs)
    119 # Building setup w/ arguments.
    120 self.arguments['{VERBOSE}'] = str(self.verbose).upper()
--> 121 results = self._run_ccdr(data, verbose=self.verbose)
    122 return nx.relabel_nodes(nx.DiGraph(results),
    123 {idx: i for idx, i in enumerate(data.columns)})

~/opt/anaconda3/envs/cdt/lib/python3.7/site-packages/cdt/causality/graph/CCDr.py in _run_ccdr(self, data, fixedGaps, verbose)
    140 except Exception as e:
    141 rmtree(run_dir)
--> 142 raise e
    143 except KeyboardInterrupt:
    144 rmtree(run_dir)

~/opt/anaconda3/envs/cdt/lib/python3.7/site-packages/cdt/causality/graph/CCDr.py in _run_ccdr(self, data, fixedGaps, verbose)
    136 data.to_csv(Path('{}/data.csv'.format(run_dir)), header=False, index=False)
    137 ccdr_result = launch_R_script(Path("{}/R_templates/CCDr.R".format(os.path.dirname(os.path.realpath(__file__)))),
--> 138 self.arguments, output_function=retrieve_result, verbose=verbose)
    139 # Cleanup
    140 except Exception as e:

~/opt/anaconda3/envs/cdt/lib/python3.7/site-packages/cdt/utils/R.py in launch_R_script(template, arguments, output_function, verbose, debug)
    222 print("\nR Python Error Output \n-----------------------\n")
    223 print(e)
--> 224 raise RuntimeError("RProcessError ") from None
    225
    226 if not debug:

RuntimeError: RProcessError

Talivni commented 2 years ago

I'm having the same issue

francescomontagna commented 2 years ago

I have the same issue

diviyank commented 2 years ago

This is strange; I'm not able to replicate this issue. Are you using cdt 0.6.0?

njbamboo commented 1 year ago

My envs are:

It gave me the error messages below:

''' Output:
GES is ran on the skeleton of the given graph.
adjacency_matrix will return a scipy.sparse array instead of a matrix in Networkx 3.0.
ARGUMENT '/var/folders/j6/9thfps154wzblxd7qf8bzmwr0000gn/T/cdt_R_script_ed9d3c4b-ac0b-41be-aab0-23896d0c159c/instance_ges.R' ignored
...
R Python Error Output

[Errno 2] No such file or directory: '/var/folders/j6/9thfps154wzblxd7qf8bzmwr0000gn/T/cdt_ges_a1452f34-d677-4e4e-bba1-51f028a98c9c/result.csv'

RuntimeError Traceback (most recent call last)
Cell In [15], line 2
      1 model = cdt.causality.graph.GES()
----> 2 output_graph = model.predict(data, new_skeleton)
      3 print(nx.adjacency_matrix(output_graph).todense())

File /opt/homebrew/Caskroom/miniforge/base/envs/py310/lib/python3.10/site-packages/cdt/causality/graph/model.py:65, in GraphModel.predict(self, df_data, graph, **kwargs)
     63 return self.create_graph_from_data(df_data, **kwargs)
     64 elif isinstance(graph, nx.DiGraph):
---> 65 return self.orient_directed_graph(df_data, graph, **kwargs)
     66 elif isinstance(graph, nx.Graph):
     67 return self.orient_undirected_graph(df_data, graph, **kwargs)

File /opt/homebrew/Caskroom/miniforge/base/envs/py310/lib/python3.10/site-packages/cdt/causality/graph/GES.py:159, in GES.orient_directed_graph(self, data, graph)
    149 """Run GES on a directed graph.
    150
    151 Args:
   (...)
    156 networkx.DiGraph: Solution given by the GES algorithm.
    157 """
    158 warnings.warn("GES is ran on the skeleton of the given graph.")
--> 159 return self.orient_undirected_graph(data, nx.Graph(graph))

File /opt/homebrew/Caskroom/miniforge/base/envs/py310/lib/python3.10/site-packages/cdt/causality/graph/GES.py:143, in GES.orient_undirected_graph(self, data, graph)
...
--> 224 raise RuntimeError("RProcessError ") from None
    226 if not debug:
    227 rmtree(base_dir)

RuntimeError: RProcessError

Jiuyue213 commented 1 year ago

I'm having the same issue. What should I do?
Below is the full error I get when running in a Jupyter notebook; I have also attached my code and the packages in my environment. Please help me, thanks.

My code:

import cdt
import networkx as nx
import pandas as pd
import matplotlib.pyplot as plt
from cdt.causality.graph import PC

# Load the data
data = pd.read_csv('http://www.causality.inf.ethz.ch/data/lucas0_train.csv')

# Infer the causal diagram
pc_output = PC().create_graph_from_data(data)

# Visualize the diagram
nx.draw_networkx(pc_output)
plt.show()

The error:

R Python Error Output 
-----------------------

[Errno 2] No such file or directory: 'C:\\Users\\Jiuyue\\AppData\\Local\\Temp\\cdt_pc_dccede31-26aa-4f57-a35b-5aeb44deaf6b\\result.csv'
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In [13], line 11
      8 data = pd.read_csv('http://www.causality.inf.ethz.ch/data/lucas0_train.csv')
     10 # Infer the causal diagram
---> 11 pc_output = PC().create_graph_from_data(data)
     13 # Visualize the diagram
     14 nx.draw_networkx(pc_output)

File D:\anaconda3\envs\dowhy\lib\site-packages\cdt\causality\graph\PC.py:278, in PC.create_graph_from_data(self, data, **kwargs)
    275 self.arguments['{NJOBS}'] = str(self.njobs)
    276 self.arguments['{VERBOSE}'] = str(self.verbose).upper()
--> 278 results = self._run_pc(data, verbose=self.verbose)
    280 return nx.relabel_nodes(nx.DiGraph(results),
    281                         {idx: i for idx, i in enumerate(data.columns)})

File D:\anaconda3\envs\dowhy\lib\site-packages\cdt\causality\graph\PC.py:315, in PC._run_pc(self, data, fixedEdges, fixedGaps, verbose)
    313 except Exception as e:
    314     rmtree(run_dir)
--> 315     raise e
    316 except KeyboardInterrupt:
    317     rmtree(run_dir)

File D:\anaconda3\envs\dowhy\lib\site-packages\cdt\causality\graph\PC.py:310, in PC._run_pc(self, data, fixedEdges, fixedGaps, verbose)
    307     else:
    308         self.arguments['{E_EDGES}'] = 'FALSE'
--> 310     pc_result = launch_R_script(Path("{}/R_templates/pc.R".format(os.path.dirname(os.path.realpath(__file__)))),
    311                                 self.arguments, output_function=retrieve_result, verbose=verbose)
    312 # Cleanup
    313 except Exception as e:

File D:\anaconda3\envs\dowhy\lib\site-packages\cdt\utils\R.py:221, in launch_R_script(template, arguments, output_function, verbose, debug)
    219     print("\nR Python Error Output \n-----------------------\n")
    220     print(e)
--> 221     raise RuntimeError("RProcessError \nR Process Error Output \n-----------------------\n" + str(err, "ISO-8859-1")) from None
    222 print("\nR Python Error Output \n-----------------------\n")
    223 print(e)

RuntimeError: RProcessError 
R Process Error Output 
-----------------------
Loading required package: momentchi2
Loading required package: MASS
Error: invalid multibyte character in parser at line 1
Execution halted

Below are all the packages in my environment and their versions:

asttokens          2.0.8
backcall           0.2.0
cdt                0.6.0
certifi            2022.9.24
charset-normalizer 2.1.1
colorama           0.4.6
contourpy          1.0.6
cycler             0.11.0
debugpy            1.6.3
decorator          5.1.1
dowhy              0.8
entrypoints        0.4
executing          1.2.0
fonttools          4.38.0
GPUtil             1.4.0
idna               3.4
ipykernel          6.17.1
ipython            8.5.0
jedi               0.18.1
joblib             1.2.0
jupyter_client     7.3.5
jupyter-core       4.11.1
kiwisolver         1.4.4
matplotlib         3.6.2
matplotlib-inline  0.1.6
mpmath             1.2.1
nest-asyncio       1.5.5
networkx           2.8.8
numpy              1.23.4
packaging          21.3
pandas             1.5.1
parso              0.8.3
patsy              0.5.3
pickleshare        0.7.5
Pillow             9.3.0
pip                22.2.2
prompt-toolkit     3.0.31
psutil             5.9.2
pure-eval          0.2.2
pydot              1.4.2
Pygments           2.13.0
pyparsing          3.0.9
python-dateutil    2.8.2
pytz               2022.6
pywin32            305
pyzmq              24.0.1
requests           2.28.1
scikit-learn       1.1.3
scipy              1.9.3
setuptools         65.5.0
six                1.16.0
skrebate           0.62
stack-data         0.5.0
statsmodels        0.13.5
sympy              1.11.1
threadpoolctl      3.1.0
torch              1.10.0
tornado            6.2
tqdm               4.64.1
traitlets          5.5.0
typing_extensions  4.4.0
urllib3            1.26.12
wcwidth            0.2.5
wheel              0.37.1
wincertstore       0.2

chenxiachan commented 11 months ago

In my experience, the fix is to make sure all columns in your DataFrame are of type float or int. Avoid columns of type object/str.
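A minimal sketch of how that check could look with pandas. The helper names `non_numeric_columns` and `coerce_to_numeric` are made up for illustration and the toy data is not from this thread. Coercing with `errors="coerce"` turns unparseable values into NaN, which may itself need handling (for example by dropping those rows) before the data is passed to cdt.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def non_numeric_columns(df):
    """Return names of columns whose dtype is not numeric (e.g. object/str)."""
    return [col for col in df.columns if not is_numeric_dtype(df[col])]


def coerce_to_numeric(df):
    """Return a copy with every column parsed as numbers; bad values become NaN."""
    return df.apply(pd.to_numeric, errors="coerce")


if __name__ == "__main__":
    # Toy data: column "y" holds strings, one of which is not a number.
    toy = pd.DataFrame({"x": [1, 2], "y": ["3", "oops"]})
    print(non_numeric_columns(toy))
    fixed = coerce_to_numeric(toy)
    print(fixed.dtypes)
```

Listing the offending columns first, rather than coercing blindly, makes it easier to see whether a column is genuinely categorical and should be encoded or dropped instead.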