Closed tiferrei closed 1 year ago
Dear @tiferrei, thank you for contributing. This looks like an interesting feature. However, I am a little bit conflicted about the implementation because you extend the existing IncrementalMealyBuilder
classes. By contract, these types of systems should only store consistent information. By sub-classing, you can provide an instance of your builder where a consistent cache is required.
Can you maybe provide some additional information about possible use-cases for this type of cache? Maybe we can then work out a separate concept/interface for these type of caches.
Hi @mtf90, sure. So the cache is still consistent. The cache simply removes conflicting information before adding new one, instead of throwing an exception.
The use case is caching learning information on systems that mutate over time, where priority is given to "fresher" information. Conflicting info is then just evidence the target system has mutated, and the previous information can no longer hold.
I agree this doesn't currently meet the current interface. I'm completely open to designing a better fitting one!
Funnily enough, your description of its behavior is exactly what I would call inconsistent :laughing:. Querying the same input may produce different outputs at different times. When used with the current algorithms of LearnLib this could cause problems, since most learning algorithms require a deterministic SUL behavior. Do you use a specialized learning algorithm for mutable systems with this?
What I'm currently wondering about is, how one would use this class as a cache. Normally, if you have a cache miss, the query is delegated to the system and the result is inserted into the cache. Repeated queries will then never reach the SUL again. Now, if you have mutating systems, you never know if the system has mutated, so you basically have to delegate all queries to the system to potentially override old data, rendering its role of a cache useless.
Also, Mealy caches are typically prefix-closed. Let's say you insert abc
-> xyz
and abd
-> xyu
and then override abc
-> zyx
. What would a look up of abd
return? zyu
? Are you assuming that only the transition outputs change or possibly the state transitions as well? If the latter, overwriting a trace would have to invalidate all its suffixes as well.
Hi, yes current LearnLib algorithms will not be able to handle these kind of mutating systems. We do have an algorithm for it, however we can't contribute our fork to learn lib until this cache is in place for it. In reality we use it more than solely as a cache, still one of the benefits it provides for us is still caching.
To answer the last question, abd
would not be in the cache, as inserting abc/zyx
would conflict at the input a
. The suffixes are invalidated as well.
I see. I assume you plan on using some of the other methods of the IncrementalMealyBuilder
interface (e.g., graph-based visualization and counterexample search) as well, right? Some time ago, we have removed the SuffixTrie class because it was unused. If you only need to store suffixes in a tree-style fashion, we could think about re-vitalizing that code and keep it simple.
Otherwise, I'm thinking about moving some interfaces around. I.e., making AbstractIncrementalMealyBuilder
no longer implement IncrementalMealyBuilder
directly but rather move it down to the actual implementations (IncrementalMealyDAGBuilder
, IncrementalMealyTreeBuilder
, DynamicIncrementalMealyTreeBuilder
).
This way, a potential AdaptiveIncrementalMealyTreeBuilder
could still inherit functionality from a shared super class but no longer has to implement the conflicting IncrementalMealyBuilder
interface. Instead we could think of a separate AdaptiveIncrementalMealyBuilder
interface with methods tailored more towards your use case. This would also keep the door open for other implementations (such as a DAG-based adaptive builder). What do you think?
Hmm we do make use of the usual methods of interaction like lookup, insert, etc. but also findSeparatingWord
, from the cex search part. When we started we just really wanted a way of building a tree-like mealy machine incrementally, but at the end the cex search functions turned out to be useful too.
I think the idea of the restructured interfaces is perfect! It does make things clearer, and sounds like the AdaptiveTree could live happily there
Would you then kick off this idea by providing an interface that contains all the methods that you deem relevant? For example, I saw that you need a slightly adjusted insert
method with a query index and want to publicly offer a conflict
check, etc.
Some remarks about the current implementation: As far as I understood, you use the stateToQuery
and queryStates
fields to keep track of the insertion order, especially when removing sub-trees. Should this be exposed via a public parameter (queryIndex
) or is this something to keep track of internally (e.g., via a counter)?
Also, do you think this is a fuctionality that a generic "AdaptiveMealyBuilder" should provide or is this something specific to your learning algorithm? Would it potentially be cleaner to do the book-keeping in some utility class in the learner's module and just offer the overriding/conflict checks in the adapter builder?
So some of these details we might actually not need. Let me clean this up a bit and I'll push a more minimal version of the interface, if we are to extract it into one.
I believe that at its core, the Adaptive version should be like the classic incremental builder, with the exception of never throwing conflict exceptions. Instead it fixes these by removing conflicting information.
So some of these details we might actually not need. Let me clean this up a bit and I'll push a more minimal version of the interface, if we are to extract it into one.
Great. I'll have look at the refactoring afterwards.
Just another brainstorm: Maybe it could be useful to return the sub-tree that would be / is overriden when inserting conflicting traces (either for the conflict
or insert
method) rather than a Boolean
(null
could be "no conflict"). Maybe this can encode some information and simplify query counting or something.
Just another brainstorm: Maybe it could be useful to return the sub-tree that would be / is overriden when inserting conflicting traces (either for the
conflict
orinsert
method) rather than aBoolean
(null
could be "no conflict"). Maybe this can encode some information and simplify query counting or something.
What do you think of making insert have the exact same signature (except the exception thrown) as the classic case? In theory the same use can be obtained by calling conflicts
before insert
. And in this case then conflicts
returns the tree / null case.
I'm fine with anything that leads to an elegant implementation. You know the requirements of your algorithm better than I do. I was just trying to highlight that we are not restricted by the "old" signatures.
If you have no use for the replaced sub-tree, we could also return the index of the first mismatch (so it may be easier to trace the change in the automaton) or just keep the boolean value if any additional information doen't matter at all.
For me specifically we just need to know if this new information causes a conflict or not. That being said, any of these options gives us this info, so I don't mind making them more information-heavy.
If you're happy with a boolean return value, we should use this then. We don't have to make it overly complex just for complexity's sake.
It was just an idea that maybe you could skip the conflict
method if you encode the conflict within the insert
result. But if you need a separate (read-only) conflict check, then two boolean methods are fine with me.
If you're happy with a boolean return value, we should use this then. We don't have to make it overly complex just for complexity's sake.
It was just an idea that maybe you could skip the
conflict
method if you encode the conflict within theinsert
result. But if you need a separate (read-only) conflict check, then two boolean methods are fine with me.
I don't think we need the 2 separate methods. If you are ok with insert
returning the Boolean itself, I can make that work without the conflict
. That being said, I can't implement the current interface because it demands a void
return type. With the extra conflict method, I can keep the signature and get the extra functionality from that method.
Well then that is the perfect moment to introduce the new interface and adjust the class hierarchy. I dabbled around a little bit and have a patch ready (see below). If you invite me as a collaborator, I can also push it to the repo directly.
Sounds great, I have added you!
You should now be able to adjust the AdaptiveMealyBuilder
interface according to your needs. I have already switched the insert
method to return a boolean value. To activate the changes, you only need to change the AdaptiveMealyTreeBuilder
from implements IncrementalMealyBuilder<I, O>
to implements AdaptiveMealyBuilder<I, O>
. This should not affect any other classes.
The only (potentially breaking) change that I have introduced so far is that the insert method now expects an output Word
rather than an output List
.
If you are happy and adjusted everything according to your needs, I can have another look at some more cleanup work (extracting shared functionality, making code-analysis pass, etc.)
Ah, hmm. I believe I can't currently extend AbstractIncrementalMealyTreeBuilder
and implement AdaptiveMealyBuilder
at the same time due to the conflicting return types of insert
. I guess that makes sense and these two shouldn't be linked. Copying out the remaining methods like lookup, etc. does mean some duplication though...
Sorry, I didn't think the last step through. I added some of the cleanup stuff already (mainly an intermediate abstraction layer to share some more functionality). It should be easier now to swtich to the adaptive interface.
I have pushed the switch!
Great! So the next step would be to check if you need any additional methods for your algorithm or if you even can remove some unnecessary stuff (if I recall correctly, your book-keeping of oldest queries was somewhat optional?).
Given the new AdaptiveMealyBuilder
interface, you can decide what functionality should be publicly exposed. Everything else, I'd be happy to strip from the implementation.
Let me know if you feel that all requirements are met, so that I'll have a final look at cleaning up before merging. If in the meantime you have any questions, feel free to ask.
So for our specific algorithm we do make use of retrieving the oldest query info. I have removed the info from the method signature and into an internal counter, as in our case we can assume these are strictly increasing, where the most recently inserted info is the freshest.
In that sense, if we were to, for example, make a AdaptiveMealyDAGBuilder
, it should have that function too. So I think it makes sense to build it into the interface. I will build it into the interface and push.
Maybe getInputAlphabet
should also move down to a more core interface?
Maybe
getInputAlphabet
should also move down to a more core interface?
getInputAlphabet
in introduced by the separate InputAlphabetHolder<I>
interface. This isn't anything MealyBuilder
specific.
Ah yes I see. In that case I believe it is done, requirements wise. Let me know if you'd like me to do any further changes, and thank you for the help!
I refactored some of the previous cleanups. Mainly, this was just moving the alphabet related code into a shared super class. We have to do some code copying because java doesn't support multiple inheritance. But I rather copy just the insert
method than all of the alphabet related stuff. This makes the IncrementalMealyTreeBuilder
and AdaptiveMealyTreeBuilder
classes much cleaner.
Furthermore, I did refactor some parts of the adaptive implementation:
getOldestInput
method now returns a straight Word<I>
because I think it is bad library design to have user-land deal with wildcards.LinkedHashMap
to store the queries. Since a LinkedHashMap
iterates over elements in insertion-order, we can skip the manual book-keeping of ages and counters. According to your tests, everything works as expected. Does this correctly reflect your intended behavior or did I miss something?If you have nothing more to add, I'm happy with merging.
This sounds fantastic! Indeed the tests cover my intended behaviour, and I also just compiled and tested things locally on the algorithms and all runs smoothly. I have nothing more to add and happy with merging too.
Hi there,
In a recent paper to be published we make use of a type of Tree that allows for conflicts. As such I'd like to contribute our extension of Incremental Mealy Tree Builders.
Please let me know if there are any design choices, improvements, or questions you may have to merge this in.
Thank you!
Tiago