eclipse-qvto / org.eclipse.qvto

Eclipse Public License 2.0

QVTo uses excess memory #963

Open eclipse-qvt-oml-bot opened 1 day ago

eclipse-qvt-oml-bot commented 1 day ago

| | |
| --- | --- |
| Bugzilla Link | 492326 |
| Status | NEW |
| Importance | P3 normal |
| Reported | Apr 24, 2016 14:16 EDT |
| Modified | Apr 25, 2016 05:16 EDT |
| Reporter | Ed Willink |

Description

I've just been preparing some performance benchmarks for a paper for BigMDE 2016 and have been surprised at how few model elements QVTo can cope with.

In a 256MB VM, a simple transformation test that is little more than a memory copy:

Manual copy fails at 1,000,000 elements\
EcoreUtil copy fails at 500,000 elements\
QVTc interpreted fails at 1,000,000 elements\
QVTc code generated fails at 500,000 elements

But QVTo fails at around 150,000 elements; a factor of five away from what would be expected, given that the performance is scaling well with model size and is otherwise almost identical to QVTc interpreted.

Using VisualVM to get a heap dump shows that there are huge, equal numbers of EValues, VarParameterValueImpl and EObjectContainmentELists. At the time of the dump, "huge" was 426,000.

It seems that the memory required for the trace is much larger than that required for inputs and outputs.

Suggest redefining a, possibly non-EMF, model to capture the trace compactly. If required, this can be exported as an EMF trace resource.

[QVTc performs a usage analysis and so only incurs re-invocation guard overheads for the fewer than 10% of mappings that actually require a trace.]

eclipse-qvt-oml-bot commented 1 day ago

By Sergey Boyko on Apr 24, 2016 15:23

Hi Ed,

Definitely the case to investigate and improve.

Could you please attach the QVTo script(s) that you use for testing?

Best Regards,

eclipse-qvt-oml-bot commented 1 day ago

By Ed Willink on Apr 25, 2016 05:16

The tests are invoked by

GIT\org.eclipse.qvtd\tests\org.eclipse.qvtd.doc.bigmde2016.tests.launches\QVTo -BigMDE2016 - Families2Persons (256MB).launch

The EValue design is perhaps convenient, but it's bloated. Now that we use 64-bit machines everything costs 8 bytes, so the bytes just fly away. Even a HashMap$Node is 48 bytes!

IIRC from when I was writing my QVT Traceability paper, I was a bit shocked by the use of a map to correlate parameters/arguments where list positions would have been adequate. There is therefore probably a very low-hanging factor of two to be had on the current design. But it would be API breaking.
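To make the map-versus-position point concrete, here is a minimal Java sketch. It is not the QVTo trace API: `TraceBindingSketch`, `Signature`, `mapRecord` and `PositionalRecord` are hypothetical names invented for illustration. The idea is that parameter names are stored once per mapping, and each invocation record keeps only an array of values in declaration order.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public final class TraceBindingSketch {
    /** Hypothetical: parameter names in declaration order, shared by all records of one mapping. */
    static final class Signature {
        final List<String> names;
        Signature(String... names) { this.names = Arrays.asList(names); }
        int indexOf(String name) { return names.indexOf(name); }
    }

    /** Map-based record: one map node per parameter, for every invocation. */
    static Map<String, Object> mapRecord(Signature sig, Object... args) {
        Map<String, Object> record = new HashMap<>();
        for (int i = 0; i < sig.names.size(); i++) {
            record.put(sig.names.get(i), args[i]);
        }
        return record;
    }

    /** Positional record: a single array per invocation; names live only in the signature. */
    static final class PositionalRecord {
        final Signature sig;
        final Object[] values;

        PositionalRecord(Signature sig, Object... values) {
            if (values.length != sig.names.size()) {
                throw new IllegalArgumentException("arity mismatch");
            }
            this.sig = sig;
            this.values = values.clone();
        }

        /** Look a value up by name via its position in the shared signature. */
        Object get(String name) {
            int i = sig.indexOf(name);
            return i < 0 ? null : values[i];
        }
    }
}
```

Both forms answer the same lookup by name; the positional form simply avoids allocating a map and its nodes for each traced invocation. How much that saves in the real trace model would need measuring.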


QVT 1.3 clarifies the requirements for internal (resolve) traceability. This could be an opportunity to align with it.


http://issues.omg.org/browse/QVT14-15 highlights the total naivety of the claim that QVTo can support incremental execution. If QVTo really is to support incremental, then a major development is required. Alternatively if the only practical approach to incremental support is a QVTo2QVTc transformation to exploit the QVTc analyses, then the current persisted QVTo trace might be re-visited to see whether it satisfies debug / profiling analyses.


I think there is an interesting compact representation for models in general that uses a serialization of 2-byte values, which, in UTF-style, codes as 15 bit values per element with the sixteenth bit signalling the need for a further 15 (or more) bits from the next 2-bytes. This will enable most things to squash down into 2-bytes rather than 8-bytes. Certainly worthwhile for no-longer-in-use data. Quite possibly worthwhile always once combined with custom access methods that bypass eGet call chains. If objects are allocated and indexed from fixed size pools we may do better than a four-fold saving.
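As a rough illustration of the 2-byte coding described above, the following Java sketch packs a non-negative `int` into 16-bit units, with 15 payload bits per unit and the sixteenth bit flagging that another unit follows. The class name `Packed15`, the low-order-first chunk ordering and the restriction to non-negative `int` values are assumptions made for the example, not part of any existing EMF or QVT API.

```java
import java.util.Arrays;

public final class Packed15 {
    private static final int PAYLOAD_MASK = 0x7FFF; // low 15 bits carry data
    private static final int CONTINUE_BIT = 0x8000; // sixteenth bit: more units follow

    /** Encode a non-negative value as 1 to 3 sixteen-bit units, low-order chunk first. */
    static char[] encode(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("value must be non-negative");
        }
        char[] buf = new char[3]; // 31 bits need at most three 15-bit chunks
        int n = 0;
        do {
            int chunk = value & PAYLOAD_MASK;
            value >>>= 15;
            if (value != 0) {
                chunk |= CONTINUE_BIT; // signal that a further unit is needed
            }
            buf[n++] = (char) chunk;
        } while (value != 0);
        return Arrays.copyOf(buf, n);
    }

    /** Decode one value starting at the given offset in a unit array. */
    static int decode(char[] units, int offset) {
        int result = 0;
        int shift = 0;
        while (true) {
            int unit = units[offset++];
            result |= (unit & PAYLOAD_MASK) << shift;
            if ((unit & CONTINUE_BIT) == 0) {
                return result;
            }
            shift += 15;
        }
    }
}
```

Under this scheme any value below 32,768 (for example an index into a fixed-size pool) occupies a single 2-byte unit, and only larger values spill into a second or third unit.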

This is on my long list of things to do for QVTr/QVTc. It would be good to make it more generally useful as an alternative in-memory EMF representation.