Closed ceesu closed 5 years ago
a) A measured confounder is a common cause that is present in the dataset. To tell whether L1, L2 are discovered as confounders, you can simply analyse the causal graph and see whether there are causal relationships going from L1, L2 to A and B.
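The check described above can be sketched in a few lines. This is only an illustration: it assumes the discovered graph has been collected by hand as a list of `(cause, effect, delay)` tuples, which is a hypothetical representation and not necessarily how TCDF stores its results; `common_causes` is a made-up helper name.

```python
def common_causes(edges, a, b):
    """Return the variables that have a discovered edge to both a and b."""
    # edges: iterable of (cause, effect, delay) tuples (hypothetical format)
    causes_of_a = {c for c, e, _ in edges if e == a}
    causes_of_b = {c for c, e, _ in edges if e == b}
    # a and b themselves are not counted as their own confounders
    return (causes_of_a & causes_of_b) - {a, b}


# Toy graph: L1 points to both A and B, L2 only to A.
example_edges = [("L1", "A", 1), ("L1", "B", 2), ("L2", "A", 1), ("A", "B", 1)]
found = common_causes(example_edges, "A", "B")
```

In this toy graph only L1 qualifies as a measured confounder of A and B, because no edge from L2 to B was discovered.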
b) TCDF does not give a perfect causal graph, as shown by the experiments in the paper. It is therefore not possible to tell beforehand whether TCDF will discover all causal relationships between L1, L2, A and B.
Thanks very much for your response. It seems that when I include the measurements of L1 and L2 in the dataset, TCDF correctly identifies them as causes. When I leave them out, however, it does not seem to detect any hidden confounders, even though the time series are not very long (250 steps). Perhaps that is because my current dataset has many variables.
Given your response, I would like to try it with a range of layer counts, and perhaps also with the finance dataset, to see what happens. For this purpose I have a few small follow-up questions:
filename="file.csv"
%run -i "runTCDF.py" --data filename
It seems the program will try to find a file literally named "filename" instead of "file.csv".
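Outside the `%run` magic, one possible workaround is to build the command in plain Python so that the value of the variable is passed rather than its name. The sketch below assumes `runTCDF.py` sits in the working directory and accepts `--data` as in the snippet above; `tcdf_command` and `run_tcdf` are hypothetical helper names, not part of TCDF.

```python
import subprocess
import sys


def tcdf_command(data_file, script="runTCDF.py"):
    """Build the argument list for one TCDF run on data_file."""
    # sys.executable is the Python interpreter running this code
    return [sys.executable, script, "--data", data_file]


def run_tcdf(data_file):
    """Run TCDF in a subprocess; raises CalledProcessError on failure."""
    return subprocess.run(tcdf_command(data_file), check=True)


filename = "file.csv"
cmd = tcdf_command(filename)  # the value "file.csv" ends up in the command
```

Calling `run_tcdf(filename)` would then start the run. Unlike `%run -i`, the script executes in a separate process, so it does not share the notebook's namespace.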
As explained in section 4.3.2, TCDF concludes that there is a hidden confounder (i.e. one not included in the dataset) when it discovers a 2-cycle between the hidden confounder's effects in which both edges have delay 0 (see Fig. 8b). TCDF will not draw the hidden confounder as a node in the graph, but users can conclude for themselves that there should be a confounder. As a side note, our experiments showed that TCDF performs better on long time series (see Table 4), so 250 time steps might be a little short.
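To make the rule concrete, here is a minimal sketch of how one could scan a list of discovered relationships for such 2-cycles. It assumes the relationships have been collected as `(cause, effect, delay)` tuples, which is a hypothetical format and not TCDF's internal one; `zero_delay_two_cycles` is an illustrative name.

```python
def zero_delay_two_cycles(edges):
    """Return sorted variable pairs linked in both directions with delay 0."""
    # Keep only instantaneous edges between two different variables.
    zero = {(c, e) for c, e, d in edges if d == 0 and c != e}
    # A pair qualifies when the reverse edge is also instantaneous.
    pairs = {tuple(sorted((c, e))) for c, e in zero if (e, c) in zero}
    return sorted(pairs)


example_edges = [
    ("A", "B", 0), ("B", "A", 0),   # 2-cycle, both delays 0
    ("A", "C", 0), ("C", "A", 1),   # 2-cycle, but one delay is 1
]
suspects = zero_delay_two_cycles(example_edges)
```

Only the pair (A, B) is reported here; the A/C cycle is ignored because one of its edges has a non-zero delay.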
Regarding your other questions:
1) Saving the graph output and the text output separately is not explicitly supported in the current implementation. However, you could add a line of code to TCDF to do this. For example, add your own code at line 253 in runTCDF.py to write the discovered causal relationships to a file. You can then use a plotting library of your choice to read this file and visualize the graph. Other visualization toolboxes offer more functionality, such as resizing the figure.
2) I think that's not possible. The hacky way of doing it is to save your filename somewhere as a string and copy-paste it every time you want to run TCDF ;) However, please note that you can run TCDF on multiple datasets by giving a list of filenames as the argument, separated by commas.
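As a rough sketch of suggestion 1), the relationships could be written to a CSV file and read back later by whatever plotting code you prefer. The `(cause, effect, delay)` tuple format, the column names and the helper names are all assumptions made for this example; how the relationships are actually available at that point in runTCDF.py would need to be checked in the code.

```python
import csv
import os
import tempfile


def save_relationships(edges, path):
    """Write (cause, effect, delay) tuples to a CSV file with a header row."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["cause", "effect", "delay"])
        writer.writerows(edges)


def load_relationships(path):
    """Read the CSV back into a list of (cause, effect, delay) tuples."""
    with open(path, newline="") as f:
        return [(r["cause"], r["effect"], int(r["delay"]))
                for r in csv.DictReader(f)]


# Round trip through a temporary directory to show the format survives.
edges = [("L1", "A", 1), ("L1", "B", 2), ("A", "B", 0)]
with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "relationships.csv")
    save_relationships(edges, csv_path)
    restored = load_relationships(csv_path)
```

A plain CSV keeps the output independent of any particular plotting library, so the graph can be redrawn or resized without rerunning TCDF.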
Thanks very much for your reply. I switched to a dataset with 1200 time steps. Unfortunately, TCDF doesn't seem to discover any 2-cycles in this case either... I will try your suggestions for saving the plot and running multiple files, though. This is very helpful!
Hello, thanks very much for your work on this interesting project. I would like to try TCDF following the 'Finance hidden' example in your paper but I have a few questions. Currently I am working with an independent dataset that I have made.