AlexWorldD / NetEmbs

Framework for Representation Learning on Financial Statement Networks
Apache License 2.0

Images #12

Closed boersmamarcel closed 5 years ago

boersmamarcel commented 5 years ago

Hi Aleksei,

I’ve tried a few things with the plots, and I think the problem is the marker size. Reducing it from 150 to 5 already produced better plots.

We can play around with that to see what the right parameters are.
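To make the comparison concrete, here is a minimal sketch of the two marker sizes side by side. It assumes the plots are drawn with matplotlib's `scatter`, which the thread does not confirm; the random `points` array is only a stand-in for the projected embeddings.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Assumption: random 2-D points stand in for the projected embeddings.
rng = np.random.default_rng(0)
points = rng.normal(size=(500, 2))

fig, (ax_big, ax_small) = plt.subplots(1, 2, figsize=(8, 4))

# In matplotlib, `s` is the marker area in points squared, so 150 gives
# large, heavily overlapping markers on a dense plot.
big = ax_big.scatter(points[:, 0], points[:, 1], s=150)
ax_big.set_title("s=150")

# The smaller value leaves individual points distinguishable.
small = ax_small.scatter(points[:, 0], points[:, 1], s=5)
ax_small.set_title("s=5")
```

Calling `fig.savefig(...)` on the result gives an image for comparing intermediate values as well.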

AlexWorldD commented 5 years ago

I guess one of the possible problems was in this line of code: df = df.merge(embds, on="ID"). That merge attached the embeddings we found to the individual journal entries rather than to the Business Processes, so the output column Emb contained a lot of duplicates. I have now changed it to:

# Merge with GroundTruth
if MODE == "SimulatedData":
    d = add_ground_truth(embds)
elif MODE == "RealData":
    d = embds.merge(d.groupby("ID", as_index=False).agg({"GroundTruth": "first"}), on="ID")
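
The duplication described above can be reproduced on a toy example. This is only a sketch: the `journal_entries` frame, its `Amount` column, and the embedding values are hypothetical, and only the `ID`, `GroundTruth`, and `Emb` column names come from the thread.

```python
import pandas as pd

# Hypothetical toy data: several journal entries share one business-process ID.
journal_entries = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2],
    "GroundTruth": ["Sales", "Sales", "Sales", "Payroll", "Payroll"],
    "Amount": [100.0, 250.0, 75.0, 900.0, 40.0],
})

# One embedding row per ID (values are made up for illustration).
embds = pd.DataFrame({
    "ID": [1, 2],
    "Emb": [[0.1, 0.2], [0.8, 0.3]],
})

# Merging onto the journal entries repeats each embedding once per entry.
per_entry = journal_entries.merge(embds, on="ID")

# Collapsing to one GroundTruth label per ID first keeps one row per embedding.
labels = journal_entries.groupby("ID", as_index=False).agg({"GroundTruth": "first"})
per_process = embds.merge(labels, on="ID")

print(len(per_entry), len(per_process))  # prints "5 2"
```

Taking the "first" label per ID is safe only if every entry with the same ID carries the same GroundTruth value, as it does in this toy data.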