bishoksan / DFTA

Determinisation and Completion of Finite Tree Automata
2 stars 1 forks source link

Add more diverse examples #5

Closed modulovalue closed 6 months ago

modulovalue commented 6 months ago

Hello @jpgallagher,

The tree automata that DFTA comes with are heavily biased towards small alphabets (i.e., few distinct symbols) and symbols with low arities.

This PR contains 2 TA that are more diverse.

AT2 is a smaller version of AT1.

The current optimized approach seems very inefficient with AT1. I was able to fix its performance by, during each iteration of the worklist, only considering the symbols that can actually occur there.

Consider:

https://github.com/bishoksan/DFTA/blob/4fd3d06d81681a12aebf201c0de2a930d5fa6749/src/dfta/DeterminiserOpt.java#L132-L137

Notice how all sigmas are being looked through. This can be redundant as we can precisely precompute which sigmas are relevant for each state (I call them productive symbols) we can find them by only considering those sigmas where any state that we are currently considering is occurring in an argument position of a symbol. All symbols that do not have any of the states that we are considering in any of their arguments are not productive and can be ignored.

I wanted to share these example TA with you in case you ever intend to continue working on optimizing TA determinization. This optimization was necessary to make the use case that we've discussed via mail work in a reasonable amount of time.

jpgallagher commented 6 months ago

Hi @modulovalue I have not thought about your suggestion deeply but it sounds like a very interesting optimisation. Thanks for the examples. Our paper discusses a little bit which kind of FTAs may perform badly with our algorithm (and the randomly generated FTAs in our experiments performed generally much worse). But some preprocessing such as you suggest might help to alleviate that.