lokijuhy / data-traffic-control

Whhrrrr... Voooooosh... That's the sound of your data coming and going exactly where it belongs
MIT License

Merging SAD #26

Open daniel-trejobanos opened 3 years ago

daniel-trejobanos commented 3 years ago

Hi, I wonder if there is a better way to do this:

    import logging
    from functools import reduce

    import pandas as pd

    # load each feature set (dm and out_path are defined elsewhere)
    x_ss = dm[out_path]["sstats.parquet"].load()
    x_ld = dm[out_path]["lderiv.parquet"].load()
    x_cs = dm[out_path]["complexity.parquet"].load()
    x_sp = dm[out_path]["spectrum.parquet"].load()
    y = dm[out_path]["labels.parquet"].load()

    features = [x_ss.data, x_ld.data, x_cs.data, x_sp.data]
    # merge the data frames into a single feature matrix
    logging.info("Merging features")
    x = reduce(
        lambda left, right: pd.merge(left, right, on=["CaseID", "ChunkID"], how="outer"),
        features,
    )

That is, I am merging four SADs, but because I merge their underlying data frames directly, I lose their trees. How would I go about this using sad.transform()?
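
For reference, the data-frame merge in the snippet above can be reproduced in isolation. This is a minimal sketch with made-up toy frames standing in for the loaded `.data` attributes (the column names `mean`, `slope`, and `peak` are hypothetical). It only illustrates the pandas step and assumes nothing about how `sad.transform()` behaves.

```python
from functools import reduce

import pandas as pd

# Toy stand-ins for x_ss.data, x_ld.data, x_sp.data (values are made up)
keys = ["CaseID", "ChunkID"]
sstats = pd.DataFrame({"CaseID": [1, 1], "ChunkID": [0, 1], "mean": [0.5, 0.7]})
lderiv = pd.DataFrame({"CaseID": [1, 2], "ChunkID": [0, 0], "slope": [1.2, 3.4]})
spectrum = pd.DataFrame({"CaseID": [1], "ChunkID": [1], "peak": [50.0]})

features = [sstats, lderiv, spectrum]

# An outer merge keeps every (CaseID, ChunkID) pair seen in any frame;
# feature columns missing for a given pair are filled with NaN.
x = reduce(lambda left, right: pd.merge(left, right, on=keys, how="outer"), features)
print(x)
```

The result is a plain `DataFrame`, which is why any provenance attached to the individual inputs is no longer carried along at this point.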