Open sarahelsaig opened 2 years ago
production could receive a different implementation that actually uses that technical name in a sanitized form
What do you mean by this, exactly? Isn't the current implementation what production always needs?
We could indeed have a different or even a pretty much no-op implementation for development.
For example we have this line in the service which creates an intermediate variable from a conditional expression.
For the sake of verification testing it would be better if it created conditional1
for the first usage instead of conditional490fbbe6120ff28...
. Since the transformation is executed sequentially, the numeric IDs should be the same deterministically unless the code is meaningfully changed. This is very useful because you will easily notice when more or less identifiers are created compared to the reference.
For creating new code to be transformed, the full content of astNode.GetFullName()
is actually useful in identifying what kind of expressions they reference. So instead of a lossily generated ID, it should take the expression name (System.Void Hast.Samples.SampleAssembly.GenomeMatcher::FillTable(Hast.Transformer.Abstractions.SimpleMemory.SimpleMemory).((int)(num6) != (int)(0) && (int)(num7) != (int)(0)) ? 2 : 3[242..244)
) and escape it somehow to be a valid C# name, or url encode it.
AstNodeExtensions.GetFullName()
has a similar logic where it creates a globally unique (mostly human-readable) identifier for any node, even if it doesn't have an explicit name. Keep in mind that the signature plus body of a member is not necessarily a globally unique identifier.
Also keep in mind that even if e.g. conditional names needn't necessarily be globally unique, they need to be unique in their own scope. Due to inlining, this scope can change, and by the end of transformation be much larger than that of the initial expression.
Since transformation happens in a parallelized fashion per member, I don't think sequential IDs can be assigned in a deterministic fashion unless you somehow pre-assign them based on the structure of the AST.
To be clear I don't want to reinvent/replace GetFullName()
. I only want make a better use of its output instead of feeding it into a hash function.
Since transformation happens in a parallelized fashion per member
Do you mean that the AST visitor operates in parallel? I knew the decompilation is parallelized but I thought the individual transformers worked sequentially.
Yep, I see. Was just referring to your second paragraph but on a second read, I was beside the point.
TransformedVhdlManifestBuilder.TransformMembers()
works in a parallelized way but ConditionalExpressionsConvertingVisitor
and similar ones don't indeed. I was thinking about the first one.
Thanks for the info, I will keep it in mind.
Testing VHDL correctness during any core development quickly becomes cumbersome, as labels like
conditional490fbbe6120ff28cfa73a92f6fc4f384ae723eb9ad8d460306095fada72ff872
change all the time. This is becuase the hash value is derived from a technical name that contains line numbers and the full caller name.I believe a lookup-based approach (e.g. service class with
ConcurrentDictionary
) that assigns sequential IDs would be more resistant to inconsequential changes during testing. On the other hand, production could receive a different implementation that actually uses that technical name in a sanitized form, that could be helpful in debugging new code. To achieve this selection,Sha256Helper
should be replaced with anIVhdlLabelProvider
service with the mentioned two implementations.Jira issue