[cg] Exploit symbolic analysis

eclipse-ocl-bot commented 1 month ago

| --- | --- | | Bugzilla Link | 575412 | | Status | NEW | | Importance | P3 normal | | Reported | Aug 15, 2021 02:52 EDT | | Modified | Nov 09, 2022 08:17 EDT | | Depends on | 520440 | | See also | 516751, Gerrit change 196697 | | Reporter | Ed Willink |

Description

The Bug 520440 SymbolicAnalysis allows almost all try/catch code to be eliminated since the possibility of failure is eliminated at edit-time.

But the thorough manual

context IterateExp\ inv SafeSourceCanBeNull:\ (isSafe and (ownedSource <> null) and (ownedSource.type <> null)) implies ...

can be minimized as

context OCLExpression\ inv HasType: type <> null

context IterateExp\ inv HasSource: ownedSource <> null\ inv SafeSourceCanBeNull:\ isSafe implies ...

and found to be sound through global edit/compile-time symbolic evaluation.

However SafeSourceCanBeNull can crash if executed in isolation possibly by narrow selection in an Ecore Editor.

a) a CGed body has dependencies that some validator manager must execute first\ b) the mayBeInvalid guards are folded in resurrecting the manual code\ c) the mayBeInvalid guards are folded in with fast fail returns

If folding in is inline the code is bloated.

If folding in is by out-of-line, possibly cached, call we need to use an OCL friendly API within the EMF API.

Out-of-line incurs the hazards of instance-cycles that may be difficult to detect; type-cycles seem inevitable.

c) avoids the bloat and the cache emulates a lazy a).

In terms of user interaction, we presumably want to generate errors for the primary failures and warnings/nothing for the consequential failures.

eclipse-ocl-bot commented 1 month ago

By Ed Willink on Aug 15, 2021 03:13

Bug 516751 has been marked as a duplicate of this bug.

eclipse-ocl-bot commented 1 month ago

By Ed Willink on Aug 15, 2021 03:19

A revision of the code generation to exploit the new CommonSubexpressionAnalysis should be able to better place the CSE caches.

crash hazards are variously eliminated by local proof that the explicit guards are adequate, or use of an implicit guard coming from another invariant. The former require no special action, the latter can check the guarding invaraint at the CSE point and fast return on fail; no need for major control flow weaving.

eclipse-ocl-bot commented 1 month ago

By Ed Willink on Aug 19, 2021 02:33

(In reply to Ed Willink from comment #0)

If folding in is by out-of-line, possibly cached, call we need to use an OCL friendly API within the EMF API.

There seems little alternative to an outer EMF API validation method and an inner OCL-friendly method. As a minimum the outer discovers/creates the OCL validation manager and redirects to the inner.

The simplest implementation may involve an instance2constraint2status cache that is lazily populated as each new transitive constraint is needed. Potentially heavy, given that many of the transitive constraints may be really simple, but we need to avoid diagnosing twice and deferring deduplication till post diagnosis is also heavy.

The smartest approach may involve a validation schedule that guarantees that all transitive validations occur in order. (A partial validation may require the appropriate partial validation schedule to be identified.) This eliminates the space and maintenance of an instance2constraint2status cache in favour of a type-based table structure, but requires each constraint to have a variant execution in which it identifies its transitive validation calls. This may often be trivial but in the case of complex collection operations, or worse allInstances(), may be really horrible amounting to a perhaps 90% re-execution.

Perhaps a constraint2instance2status cache could support multiple constraint strategies and avoid caching for the many constraints that are not accessed transitively and so cannot be evaluated twice.

The horrible case mandates lazy checking within the inner method, so support that first with instant warning return for a transitive validation failure.

eclipse-ocl-bot commented 1 month ago

By Ed Willink on Aug 22, 2021 02:25

We need to distinguish two use cases.

Validation. Execution must tolerate and accurately diagnose ill-formed models.

Execution. Input should be well-formed, even validated, so redundant checking is bloat.

It was the offensive bloat for QVTr that motivated the invalidity analysis.

One size of CG doesn't seem to fit all.

The current CG style could be termed maximal CG; a non-trivial hierachy of try-catch handles every possible invalid as an inline possibility. Too bloated for execution.

The fully optimised CG style could be termed minimal CG; all checks are omitted on the assumption that something earlier has already guarded. If the input is not well-formed, Java NPE/CCE/IOOBE etc happen. Intolerant of ad hoc invocation, needs a validation schedule.

We need a middle way; a strict CG. For normal/strict usage, for which the symbolic analysis detects no may-be-invalid terms, we can omit the try-catch hierarchy, but retain the diagnosis friendly NPE/CCE/IIOBE guards. When any guard fails there is an instant return via a thrown InvalidValueException that will likely propgate straight out to the OCL invocation. The guards are fairly simple if-tests so incur small execution time cost and modest space cost through message construction. Use of "and" may be redirected to "and2" at compile-time.

For abnormal/semi-strict uasge, we must retain maximal CG. Abnormal/semi-strict may be the result of using "invalid" literal, an oclIsInvalid() or oclIsUndefined() operation call or failing to recode to avoid may-be-invalid in the symbolic analysis. OCL tooling must tolerate use of invalid. QVTr can impose the stronger prohibition on invalid avoiding a need for maximal CG.

This clarifies the InvalidValueException API. IVE should be thrown when a strictness violation has occurred, e.g. a bad "and2" output. IVE should be passed by value when strictness is relaxed, e.g. a bad "and" input/output.

eclipse-ocl-bot commented 1 month ago

By Ed Willink on Feb 11, 2022 11:31

(In reply to Ed Willink from comment #2)

A revision of the code generation to exploit the new CommonSubexpressionAnalysis should be able to better place the CSE caches.

Accurate identification of non-trivial common globals such as Maps is trivial once the new bottom up path-based CommonSubExpressionAnalysis is exploited. Not worth elaborating the old structural hash code -based analysis.

eclipse-ocl-bot commented 1 month ago

Nov 09, 2022 08:17

New Gerrit change created: https://git.eclipse.org/r/c/ocl/org.eclipse.ocl/+/196697

eclipse-ocl / org.eclipse.ocl

[cg] Exploit symbolic analysis #2198

Description