InvariantsMiner Optimisation

For datasets with a large number of log keys, InvariantsMiner is exceptionally slow. I tested it with a Linux syslog dataset (415 log keys), and fitting took impractically long.

I profiled InvariantsMiner and found that by far the most time is spent in the method _join_set. I optimised this method to reduce its computational complexity.
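For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of profiling a fit call with Python's built-in cProfile. Note that the `fit` and `_join_set` functions below are hypothetical stand-ins written for illustration (a naive pairwise join), not loglizer's actual code, and `profile_fit` is a helper name I made up.

```python
import cProfile
import io
import pstats


def _join_set(item_list, length):
    # Hypothetical stand-in, NOT loglizer's implementation:
    # naive pairwise union, keeping only sets of the requested size.
    return {a | b for a in item_list for b in item_list if len(a | b) == length}


def fit(num_keys):
    # Hypothetical stand-in for a fit routine: build one singleton set
    # per log key, then join them into candidate pairs.
    items = [frozenset([i]) for i in range(num_keys)]
    return _join_set(items, 2)


def profile_fit(num_keys):
    # Run fit() under the profiler and return the report as a string,
    # sorted by cumulative time so the most expensive calls come first.
    profiler = cProfile.Profile()
    profiler.runcall(fit, num_keys)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(10)
    return stream.getvalue()


report = profile_fit(200)
```

Calling `print(report)` lists the functions by cumulative time. With a real model you would replace the stand-in `fit` with the actual fit call on your dataset.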

With this change, fitting is considerably faster on the Linux syslog dataset. For the HDFS logs, runtimes are unchanged.

logpai / loglizer

InvariantsMiner Optimisation #95