AlexWorldD / NetEmbs

Framework for Representation Learning on Financial Statement Networks
Apache License 2.0

transition probabilities zero #4

Closed boersmamarcel closed 5 years ago

boersmamarcel commented 5 years ago

Hi Aleksei,

I'm running it on multiple datasets and sometimes I get the following error:

Traceback (most recent call last):
  File "/Users/mboersma/Documents/phd/students/alex/NetEmbs-master/analysisMB.py", line 52, in <module>
    simdata = similar(d, direction=["COMBI"])
  File "/Users/mboersma/Documents/phd/students/alex/NetEmbs-master/NetEmbs/FSN/utils.py", line 502, in similar
    direction=direction)
  File "/Users/mboersma/Documents/phd/students/alex/NetEmbs-master/NetEmbs/FSN/utils.py", line 418, in find_similar
    direction=_dir), top=top_n, title=str(ver + "_" + _dir))
  File "/Users/mboersma/Documents/phd/students/alex/NetEmbs-master/NetEmbs/FSN/utils.py", line 318, in get_pairs
    range(walks_per_node) for node
  File "/Users/mboersma/Documents/phd/students/alex/NetEmbs-master/NetEmbs/FSN/utils.py", line 319, in <listcomp>
    in fsn.get_BP()]
  File "/Users/mboersma/Documents/phd/students/alex/NetEmbs-master/NetEmbs/FSN/utils.py", line 257, in randomWalk
    new_v = step(G, cur_v, cur_direction, mode=2, return_full_step=return_full_path, debug=debug)
  File "/Users/mboersma/Documents/phd/students/alex/NetEmbs-master/NetEmbs/FSN/utils.py", line 193, in step
    tmp_vertex = np.random.choice(outs, p=ws)
  File "mtrand.pyx", line 1144, in mtrand.RandomState.choice
ValueError: probabilities contain NaN

Process finished with exit code 1

I haven't found the cause yet: it fails for some datasets and passes for others. So far the failure rate is roughly 50/50.

I will continue the analysis on the datasets where it does work; the results so far are good. I find interesting groups of transactions, although for now I'm only studying the COMBI approach because the others seem to give weird results.

AlexWorldD commented 5 years ago

I've added a logger inside the step function:

    if LOG:
        snapshot = {"CurrentNode": tmp_vertex, "CurrentWeight": tmp_weight,
                    "NextCandidates": list(zip(outs, ws)), "Probas": probas}
        local_logger = logging.getLogger("NetEmbs.Utils.step")
        local_logger.error("Fatal ValueError during step", exc_info=True)
        local_logger.info("Snapshot" + str(snapshot))

I hope it helps to identify the problem. In my case, I got the same error because of 0.0 values in the simulated data (for EBPayables).
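
For anyone hitting the same traceback, here is a minimal sketch of how all-zero weights can lead to "probabilities contain NaN" and one possible way to guard against it. This is not the NetEmbs code: the helper names normalize_weights, safe_choice and zero_weights_raise are hypothetical, and I am assuming the step function normalizes edge weights by their sum before calling np.random.choice.

```python
import numpy as np


def normalize_weights(weights):
    """Return weights scaled to sum to 1, or None if that is not possible.

    Hypothetical helper (not part of NetEmbs). All-zero, negative, or
    non-finite weights cannot be turned into a probability vector.
    """
    w = np.asarray(weights, dtype=float)
    total = w.sum()
    if not np.isfinite(total) or total <= 0.0 or (w < 0).any():
        return None
    return w / total


def safe_choice(candidates, weights, rng=None):
    """Pick one candidate with probability proportional to its weight.

    Returns None instead of raising when the weights are degenerate,
    so the caller can decide to stop the walk or skip the node.
    """
    rng = np.random.default_rng() if rng is None else rng
    probs = normalize_weights(weights)
    if probs is None:
        return None
    return candidates[rng.choice(len(candidates), p=probs)]


def zero_weights_raise():
    """Show that naive normalization of all-zero weights makes choice fail."""
    ws = np.array([0.0, 0.0])
    with np.errstate(invalid="ignore"):
        probs = ws / ws.sum()  # 0.0 / 0.0 gives NaN for every entry
    try:
        np.random.choice(["a", "b"], p=probs)
    except ValueError:
        return True
    return False
```

With a guard like this, a node whose outgoing weights are all 0.0 yields None rather than an exception, and the walk can end there or the node can be filtered out beforehand. Which of these is appropriate depends on how zero amounts should be treated in the data.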