eclipse-ocl / org.eclipse.ocl

Eclipse Public License 2.0

[projectmap] Improve performance #2075

Open eclipse-ocl-bot opened 3 days ago

eclipse-ocl-bot commented 3 days ago

| | |
| --- | --- |
| Bugzilla Link | 548848 |
| Status | NEW |
| Importance | P3 normal |
| Reported | Jul 02, 2019 04:25 EDT |
| Modified | Jul 11, 2019 09:30 EDT |
| Depends on | 549008 |
| See also | 548796, 549009 |
| Reporter | Ed Willink |

Description

The StandaloneProjectMap/ProjectMap establishes the mappings

nsURI => Java EPackage class (from the plugin.xml org.eclipse.emf.ecore.generated_package extension point)

nsURI => *.ecore EPackage model element (from the plugin.xml org.eclipse.emf.ecore.generated_package extension point, referenced genmodel ecorePackages, referenced ecore model element (Bug 548796))

and so enables ResourceSet, URIMap, PackageRegistry to be initialized to suppress metamodel schizophrenia.

StandaloneProjectMap/ProjectMap currently does both the nsURI mapping discovery and the ResourceSet configuration, with very limited re-use of an earlier discovery for yet another ResourceSet configuration.

Separate the discovery into a {Standalone/Eclipse}EPackageLibrary, with the static instance of StandaloneEPackageLibrary responding to 'manual' add/remove classpath entry and the EclipseEPackageLibrary running as a Job in response to automatic IRegistryChangeEvents.

For Standalone we can typically just add and add and add more classpath entries as JUnit tests proceed. Very rarely there can be removes.
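The proposed split can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class name `EPackageLibrarySketch`, its methods, and the use of plain strings instead of EMF `URI` and `EPackage` objects are hypothetical and are not existing Eclipse OCL API. The caller supplies the mappings that a real library would discover by scanning a classpath entry.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for the proposed StandaloneEPackageLibrary. */
final class EPackageLibrarySketch {
    /** classpath entry -> (nsURI -> model location) recorded for that entry. */
    private final Map<String, Map<String, String>> discovered = new LinkedHashMap<>();
    private int scanCount = 0;

    /** Records an entry once; repeated adds of the same entry do not rescan. */
    synchronized void addClasspathEntry(String entry, Map<String, String> nsURI2location) {
        if (!discovered.containsKey(entry)) {
            scanCount++; // stands in for the costly plugin.xml / *.genmodel reading
            discovered.put(entry, new LinkedHashMap<>(nsURI2location));
        }
    }

    /** The rare removal case. */
    synchronized void removeClasspathEntry(String entry) {
        discovered.remove(entry);
    }

    /** Cheap per-ResourceSet configuration: copies what was already discovered. */
    synchronized Map<String, String> configure() {
        Map<String, String> registry = new LinkedHashMap<>();
        for (Map<String, String> mappings : discovered.values()) {
            registry.putAll(mappings);
        }
        return registry;
    }

    synchronized int getScanCount() {
        return scanCount;
    }
}
```

With this shape, each further ResourceSet configuration costs one copy of the known mappings, while the discovery cost is paid once per classpath entry.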

For Eclipse/OSGi, there should be just one cost across all applications; a significant benefit to e.g. Papyrus. The only question is whether to start the EPackageLibrary job eagerly. No. It must impose no cost on non-OCL users. If an OCL builder provokes a validation then that will trigger the job fairly quickly.

For Eclipse/OSGi, plugin.xml reading is replaced by ExtensionPoint listening. Only .genmodel reading is mandatory. The .ecore reading for Bug 548796 can be lazy.

eclipse-ocl-bot commented 3 days ago

By Ed Willink on Jul 03, 2019 03:41

While debugging, a few further performance issues arise.

The EMF URIMap class uses linear lists and consequently performs very poorly for large maps involving all projects.

Reinitialization of a URIMap entry removes it (involving a full length ripple) and then adds it again.

The URIMap appears to be initialized more than once, at quadratic cost in the number of projects.

URIMap entries seem to be for all projects rather than just those with genmodels.

eclipse-ocl-bot commented 3 days ago

By Ed Willink on Jul 03, 2019 12:42

(In reply to Ed Willink from comment #1)

The URIMap appears to be initialized more than once, at quadratic cost in the number of projects.

Moderately easy to avoid re-init for the non-global maps. Global re-init should be fixed once there is a tracking global map.

URIMap entries seem to be for all projects rather than just those with genmodels.

Initializing only projects with resources saves 75% of the entries but 15 tests fail. Instrumenting:

700 standalone tests (15 failures)\
1039 StandaloneProjectMap.initializeURIMap calls\
27014 URI entries actually initialized\
122677 Projects available for initialization

eclipse-ocl-bot commented 3 days ago

By Ed Willink on Jul 03, 2019 16:53

(In reply to Ed Willink from comment #2)

URIMap entries seem to be for all projects rather than just those with genmodels.

Initializing only projects with resources saves 75% of the entries but 15 tests fail.

e.g. testDelegates_Import_476968, which accesses\
org.eclipse.ocl.examples.project.royalandloyal/model/RoyalAndLoyal.ecore

There is no plugin.xml (or extension point)\
There is no org.eclipse.emf.ecore dependency

There is an OCL nature\
There is a model directory

But we cannot rely on these to identify a model path element. Pruning URI map entries would be a breaking change. We'll have to live with the many entries and try to install a faster registry.

eclipse-ocl-bot commented 3 days ago

By Ed Willink on Jul 05, 2019 06:57

(In reply to Ed Willink from comment #3)

We'll have to live with the many entries and try to install a faster registry.

Instrumenting the URIMappingRegistryImpl for the 708 standalone Pivot OCL JUnit tests reveals some staggeringly high counts.

The global registry:\
63309031 calls to URI.replaceWith\
1790098 attempts to replace a prefix\
936453 no-replacements

1970 local registries delegating to the global

Overall\
1440 hits (lookup of an exact URI mapping)\
80205615 calls to URI.replaceWith\
3630788 attempts to replace a prefix\
2726551 no-replacements

Very few getURI calls are resolved by direct hits, yet the calls are highly repetitive. The Xtext serializer seems to make EcoreUtil.getURI work hard, with a registry normalization each time.

Why not cache a successful prefix lookup for use as a direct hit?

Overall\
1733925 hits - about 1200 times more\
1809643 calls to URI.replaceWith - 44-fold saving\
19803 attempts to replace a prefix - 183-fold saving\
18520 no-replacements - 147-fold saving

All tests still pass.
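The idea of caching a successful prefix lookup for re-use as a direct hit can be sketched as follows. This is an illustration under stated assumptions, not the code that was instrumented: `CachingPrefixRegistrySketch` and its members are hypothetical names, strings stand in for EMF URIs, and the longest matching prefix is found by a plain linear scan.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a registry that remembers earlier resolutions. */
final class CachingPrefixRegistrySketch {
    private final Map<String, String> exact = new HashMap<>();
    private final List<String[]> prefixes = new ArrayList<>(); // {fromPrefix, toPrefix}
    private final Map<String, String> resolved = new HashMap<>();
    int cacheHits = 0;
    int prefixAttempts = 0;

    void putExact(String from, String to) {
        exact.put(from, to);
        resolved.clear(); // any change may invalidate earlier resolutions
    }

    void putPrefix(String fromPrefix, String toPrefix) {
        prefixes.add(new String[] {fromPrefix, toPrefix});
        resolved.clear();
    }

    String getURI(String uri) {
        String cached = resolved.get(uri);
        if (cached != null) {
            cacheHits++; // re-use hit: no prefix replacement needed
            return cached;
        }
        String result = exact.get(uri);
        if (result == null) {
            int bestLength = -1;
            for (String[] mapping : prefixes) {
                prefixAttempts++;
                if (uri.startsWith(mapping[0]) && mapping[0].length() > bestLength) {
                    bestLength = mapping[0].length();
                    result = mapping[1] + uri.substring(mapping[0].length());
                }
            }
        }
        if (result == null) {
            return uri; // unmapped: resolves to itself and is not cached here
        }
        resolved.put(uri, result);
        return result;
    }
}
```

A repeated `getURI` call for the same URI then costs one hash lookup instead of one attempt per registered prefix, which is the effect the hit counts above measure.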

Typical use patterns for local registries are resolution of about 5 URI misses followed by 100 URI hits. Tests using UML have more like 18000 hits.

However not all acceleration can be done without changing EMF.

For instance loading UML involves a loadPackage that uses the defaultURIConverter, which could have 1194 hits if it cached resolutions. Once validated and proxies resolved, 17315 hits could have been cached. Saving the resource has another 2000 hit opportunities.

eclipse-ocl-bot commented 3 days ago

By Ed Willink on Jul 05, 2019 15:11

(In reply to Ed Willink from comment #4)

However not all acceleration can be done without changing EMF.

See Bug 549008 and Bug 549009

eclipse-ocl-bot commented 3 days ago

By Ed Willink on Jul 11, 2019 09:30

(In reply to Ed Willink from comment #5)

(In reply to Ed Willink from comment #4)

However not all acceleration can be done without changing EMF.

See Bug 549008 and Bug 549009

Discussion on Bug 549008 demonstrates the extreme difficulty of 100% compatible behavioural evolution. Probably going to be WONTFIX, but the discussion prompts some improvements in a local FasterURIMappingRegistryImpl.

For now thread safety is not necessary, but a simple option to use Collections.synchronizedMap for FasterURIMappingRegistryImpl.remappedURIs provides a future thread safety capability.
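A minimal sketch of such an option, assuming a hypothetical holder class `RemappedURIsSketch` with a `threadSafe` constructor flag and strings in place of EMF URIs; only the field name `remappedURIs` comes from the comment above.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical holder for the cache of earlier resolutions. */
final class RemappedURIsSketch {
    private final Map<String, String> remappedURIs;

    RemappedURIsSketch(boolean threadSafe) {
        Map<String, String> map = new HashMap<>();
        // The wrapper synchronizes each individual call on the map.
        this.remappedURIs = threadSafe ? Collections.synchronizedMap(map) : map;
    }

    void remember(String uri, String resolved) {
        remappedURIs.put(uri, resolved);
    }

    String recall(String uri) {
        return remappedURIs.get(uri);
    }

    int size() {
        return remappedURIs.size();
    }
}
```

Note that `Collections.synchronizedMap` only makes single calls atomic; iteration and check-then-put sequences would still need explicit synchronization on the map.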

Since local only, garbage collection/leakage is not a concern. Everything should clean up when the ResourceSet goes obsolete. No need for WeakHashMap or associated expunge synchronization.

Suppression of re-put churning must use equals rather than == because Fragment URIs are not interned.
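The equals-versus-== point can be shown with a small sketch. It assumes `java.net.URI` as a stand-in for the EMF URI class and a hypothetical `RePutSuppressionSketch` class; two URIs with the same text and fragment are distinct objects here, so only an `equals` comparison recognizes the re-put as a no-op.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: skip the remove/add churn when the mapping is unchanged. */
final class RePutSuppressionSketch {
    private final Map<URI, URI> map = new HashMap<>();
    int structuralChanges = 0;

    void put(URI from, URI to) {
        URI old = map.get(from);
        if (to.equals(old)) {
            return; // unchanged; an == test would miss equal but distinct instances
        }
        structuralChanges++;
        map.put(from, to);
    }

    URI get(URI from) {
        return map.get(from);
    }
}
```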

Once there is a previous resolutions cache, there is insufficient benefit in the further binary search optimization.

Ensuring that FasterURIMappingRegistryImpl.initializeResourceSet() is called allows 95% of URI resolutions in the standalone JUnit tests to be 'faster'.

Problems that cannot be faster are:

(Generate OCL Pivot model needed a tweak to ensure that late UML URI maps are registered locally.)

Overall the new FasterURIMappingRegistryImpl.getURI() does the local

1678404 re-use hits\
5 non-prefix hits\
1841 prefix resolutions\
9161 unmapped\
=1689411 over 99% re-use hits\
using 1385604 replacePrefix calls for 1841+9161 candidates\
-- 120 replacePrefixes per candidate

whereas the traditional EMF URIMappingRegistryImpl.getURI() does the residual

0 non-prefix hits\
31374 prefix resolutions\
75557 delegated hits\
=106931\
using 390154 replacePrefix calls for 31374+75557 candidates\
-- 4 replacePrefixes per candidate

120 replacePrefixes per candidate is a bit high.

Re-instating the binary search:

FasterURIMappingRegistryImpl 1679002+5+1855(37733)+9196=1690058 http://www.eclipse.org/ocl/2015/EssentialOCLCS.oclas

FasterURIMappingRegistryImpl 1678605+5+1856(37738)+9197=1689663

http://www.eclipse.org/ocl/2015/CompleteOCLCS.oclas\
1678605 re-use hits\
5 non-prefix hits\
1856 prefix resolutions\
9197 unmapped\
=1689663 over 99% re-use hits\
using 37738 replacePrefix calls for 1856+9197 candidates\
-- 3.4 replacePrefixes per candidate

(totals are not 100% deterministic!)
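For readers unfamiliar with the technique, one way to find the longest matching prefix with a binary search rather than a linear scan is sketched below. It is not the FasterURIMappingRegistryImpl code: `SortedPrefixLookupSketch` is a hypothetical class, strings stand in for EMF URIs, and the sorted search is delegated to `TreeMap.floorEntry`. Each probe either finds a registered prefix of the URI or shortens the candidate to the text it shares with the nearest smaller key.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of longest-prefix lookup over sorted prefixes. */
final class SortedPrefixLookupSketch {
    private final TreeMap<String, String> prefixMap = new TreeMap<>();
    int probes = 0;

    void putPrefix(String fromPrefix, String toPrefix) {
        prefixMap.put(fromPrefix, toPrefix);
    }

    String getURI(String uri) {
        String candidate = uri;
        while (!candidate.isEmpty()) {
            probes++;
            // Greatest registered key that sorts at or before the candidate.
            Map.Entry<String, String> floor = prefixMap.floorEntry(candidate);
            if (floor == null) {
                break; // nothing sorts at or before the candidate: unmapped
            }
            String key = floor.getKey();
            if (uri.startsWith(key)) {
                // No longer registered prefix can sort between key and candidate.
                return floor.getValue() + uri.substring(key.length());
            }
            // Every registered prefix of uri is also a prefix of this shared text.
            candidate = commonPrefix(key, candidate);
        }
        return uri;
    }

    private static String commonPrefix(String a, String b) {
        int n = Math.min(a.length(), b.length());
        int i = 0;
        while (i < n && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        return a.substring(0, i);
    }
}
```

The number of probes depends on how many sibling keys sort between the URI and its longest registered prefix, which fits the drop in replacePrefix attempts per candidate reported above.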