FenTechSolutions / CausalDiscoveryToolbox

Package for causal inference in graphs and in the pairwise settings. Tools for graph structure recovery and dependencies are included.
https://fentechsolutions.github.io/CausalDiscoveryToolbox/html/index.html
MIT License

LiNGAM for bivariate case #67

Open sAviOr287 opened 4 years ago

sAviOr287 commented 4 years ago

Hi

I have been trying to use LiNGAM in the bivariate case. However, the current implementation does not seem to support this, as it appears to work only on graphs for now. Ideally, I would like to use it the same way as in https://arxiv.org/pdf/1711.08936.pdf

The only difference I currently see is the dataloader:

For the graph example you have:

def load_sachs(**kwargs):
    dirname = os.path.dirname(os.path.realpath(__file__))
    return (pd.read_csv('{}/resources/cyto_full_data.csv'.format(dirname)),
            read_list_edges('{}/resources/cyto_full_target.csv'.format(dirname)))

whereas for the Tuebingen dataset you have:

def load_tuebingen(shuffle=False):
    dirname = os.path.dirname(os.path.realpath(__file__))

    data = read_causal_pairs('{}/resources/Tuebingen_pairs.csv'.format(dirname), scale=False)
    labels = pd.read_csv('{}/resources/Tuebingen_targets.csv'.format(dirname)).set_index('SampleID')

    if shuffle:
        for i in range(len(data)):
            if random.choice([True, False]):
                labels.iloc[i, 0] = -1
                buffer = data.iloc[i, 0]
                data.iloc[i, 0] = data.iloc[i, 1]
                data.iloc[i, 1] = buffer
    return data, labels

I was wondering if there is an easy fix to make LiNGAM also work on the pairwise case.

I seem to get this error:

FileNotFoundError: File b'/tmp/cdt_LiNGAMa88a58e7-358a-439b-9948-5fde62654c50/result.csv' does not exist

Thanks a lot in advance for your help

Best

diviyank commented 4 years ago

Hello,

Sure, I'll try to adapt LiNGAM for the bivariate case real soon! The error just shows that the R process errored.

Best regards, Diviyan

sAviOr287 commented 4 years ago

Hi,

Great!

Is there a hint you could give me to check what is going wrong? It seems like result.csv is not being created.

I am trying to get LiNGAM as a baseline method at the moment.

Sorry for the inconvenience.

Best

diviyank commented 4 years ago

Hi, the error comes from the data not being passed to the R process in the right shape. One way to solve this problem is to format the data as a 2-variable graph.

We are quite busy this month, so we will look into this in July.

Best, Diviyan
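
A minimal sketch of the "2-variable graph" formatting suggested above, assuming it means passing each pair to LiNGAM as an ordinary two-column DataFrame with one row per sample. The helper `pair_to_frame` and the synthetic data are hypothetical and not part of cdt. The commented-out call assumes `cdt.causality.graph.LiNGAM` with its `predict` method and a working R installation with pcalg; it has not been verified against this issue.

```python
import numpy as np
import pandas as pd


def pair_to_frame(a, b, names=("A", "B")):
    """Turn one cause-effect pair into a two-column DataFrame (one row per sample).

    Hypothetical helper: the pairwise loader stores a whole sample array per
    cell, while graph methods expect one column per variable.
    """
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    if a.shape[0] != b.shape[0]:
        raise ValueError("both variables need the same number of samples")
    return pd.DataFrame({names[0]: a, names[1]: b})


# Synthetic pair standing in for one row of the pairwise data (made-up values).
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=200)
y = 2.0 * x + rng.uniform(-0.5, 0.5, size=200)  # linear effect, non-Gaussian noise
frame = pair_to_frame(x, y)
print(frame.shape)

# Assumed usage with cdt (needs R and the pcalg package; not run here):
# from cdt.causality.graph import LiNGAM
# graph = LiNGAM().predict(frame)  # expected: a directed graph over "A" and "B"
# print(list(graph.edges()))
```

For the Tuebingen data, the same helper could be applied row by row, for example `pair_to_frame(data.iloc[i, 0], data.iloc[i, 1])`, assuming each cell holds the sample array for one variable.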