etf-validator / governance

ETF Steering Group and the Technical Committee documents

ETF Performance review #14

Closed robsgnao closed 6 years ago

robsgnao commented 6 years ago

ETF Improvement Proposal (EIP)

Background and Motivation:

Performance optimisation is required to reduce startup time and validation time. Reducing startup time will simplify horizontal scaling in cloud deployments, while reducing validation time will help when integrating ETF with the INSPIRE Geoportal or any other metadata-related workflow or pipeline. The following possible issues have been identified:

Proposed change

It is proposed to investigate the following points:

Alternatives

Funding

JRC will ask its current contractor to perform a performance assessment and identify quick wins (if any).

Additional information

deployments and/or Executable Test Suites. n/a

cportele commented 6 years ago

@jonherrmann: TC to review options and report back to SG.

See minutes of the 3rd SG meeting.

jonherrmann commented 6 years ago
  • maintain a local cache of the most-used schema files, and maintain a time-bound cache for ad-hoc schema downloads (e.g. it is reasonable to expect the same set of custom schemas for the same local administration);

See https://github.com/etf-validator/etf-webapp/issues/85 .
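To make the two cache tiers in the bullet above concrete, here is a minimal sketch in Python. It is not ETF code: the `SchemaCache` class, its `pin` and `get` methods, and the injected `fetch` and `clock` callables are all hypothetical names chosen for illustration. Pinned entries stand in for the local cache of most-used schemas, and ad hoc downloads expire after a configurable time.

```python
import time
from typing import Callable, Dict, Tuple


class SchemaCache:
    """Hypothetical two-tier schema cache (illustration only, not ETF code).

    Pinned schemas never expire; ad hoc downloads are kept for ttl_seconds.
    """

    def __init__(self, fetch: Callable[[str], bytes], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch      # assumed downloader: URL -> schema bytes
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested
        self._pinned: Dict[str, bytes] = {}
        self._adhoc: Dict[str, Tuple[float, bytes]] = {}

    def pin(self, url: str, content: bytes) -> None:
        """Register a most-used schema that is served locally forever."""
        self._pinned[url] = content

    def get(self, url: str) -> bytes:
        """Return the schema, downloading only on a miss or after expiry."""
        if url in self._pinned:
            return self._pinned[url]
        now = self._clock()
        entry = self._adhoc.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        content = self._fetch(url)
        self._adhoc[url] = (now, content)
        return content
```

The sketch ignores eviction by size and concurrent access, both of which a real implementation would have to handle.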

  • load (at least) the most used Test Suites in memory, and refresh them on a given time frame (if needed);

The ETS resource files could be pre-loaded in memory. In the database, the object cache could be reactivated. This will increase memory consumption. However, since the test engines need a little time to initialize, it would be good to profile this before making any changes.

  • refresh DB at given intervals, but not during startups;

See the comment above: it would be good to profile and describe the relevant building blocks first. A parameter could perhaps be introduced to deactivate some time-consuming consistency checks during startup.
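As a rough illustration of profiling the startup building blocks and switching off the consistency checks with a parameter, consider the Python sketch below. The phase names, the `startup` function and the `skip_consistency_checks` flag are assumptions made for this example; they do not correspond to existing ETF settings.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator, List

# Accumulated wall-clock seconds per startup phase.
timings: Dict[str, float] = {}


@contextmanager
def timed(phase: str) -> Iterator[None]:
    """Measure how long the enclosed block takes and record it under `phase`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[phase] = timings.get(phase, 0.0) + time.perf_counter() - start


def startup(skip_consistency_checks: bool = False) -> List[str]:
    """Run the (hypothetical) startup phases and return the ones executed."""
    executed: List[str] = []
    with timed("load_test_suites"):
        executed.append("load_test_suites")        # placeholder for real work
    if not skip_consistency_checks:
        with timed("consistency_checks"):
            executed.append("consistency_checks")  # placeholder for real work
    with timed("init_test_engines"):
        executed.append("init_test_engines")       # placeholder for real work
    return executed
```

After a run, `timings` shows which phase dominates, which is the information needed before deciding what to cache or skip.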

  • where possible, parallelise test suite execution;

Parallelism could be achieved on two levels:

If a Test Run contains multiple ETSs, they could be run in parallel. It remains to be clarified how dependencies are treated: either ignore dependencies and start all ETSs at once, or wait until the ETSs that an ETS depends on have finished. It could also matter whether this is a group of service tests that put the service in different states. Starting and managing the threads would be done by ETF.
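Assuming the second option (wait for dependencies), scheduling at the Test Run level could look roughly like the following Python sketch. The `run_test_run` function, the dependency map and the `run_ets` callback are hypothetical; ETF itself is not written in Python and has no such API.

```python
from concurrent.futures import FIRST_COMPLETED, Future, ThreadPoolExecutor, wait
from typing import Callable, Dict, List, Set


def run_test_run(dependencies: Dict[str, Set[str]],
                 run_ets: Callable[[str], None],
                 max_workers: int = 4) -> List[str]:
    """Run every ETS once all ETSs it depends on have finished.

    `dependencies` maps an ETS id to the ids it depends on.
    Returns the ETS ids in the order in which they completed.
    """
    remaining = {ets: set(deps) for ets, deps in dependencies.items()}
    finished: List[str] = []
    running: Dict[Future, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while remaining or running:
            done_ids = set(finished)
            ready = [e for e, deps in remaining.items() if deps <= done_ids]
            for ets in ready:
                del remaining[ets]
                running[pool.submit(run_ets, ets)] = ets
            if not running:
                # Nothing is running and nothing can start: cycle or unknown id.
                raise ValueError(f"unresolvable dependencies: {sorted(remaining)}")
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                future.result()  # re-raise any exception from the worker
                finished.append(running.pop(future))
    return finished
```

Independent ETSs run concurrently, while an ETS with dependencies is submitted only after all of them have completed. Service tests that change the state of the service would still need to be serialised by declaring dependencies between them.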

The ETS level includes all levels below an ETS: the Test Modules, Test Cases, Test Steps and Assertions. It is up to the test developer and the capabilities of the test engine how these items are parallelized, if they are parallelizable at all.

However, there must be communication between the ETS and the framework:

A good first step to improve the behaviour under load would be to implement the improvement described in https://github.com/etf-validator/etf-webapp/issues/169. Another change that would increase performance with reasonable effort would be the parallelisation of ETSs in Test Runs. For another ETF project we are working on, the first point (the schema validation cache) is also of interest. We will open a separate EIP for it soon.

In other aspects I still see a need for further analysis and clarification.

carlospzurita commented 6 years ago

We find the possibility of parallel execution at the ETS level very interesting. Is there any documentation on this subject, e.g. on the syntax or the implementation in ETF?

jonherrmann commented 6 years ago

An interface for, let's say, parallelisation contracts between ETF and ETS does not yet exist. It would also require a manager object that monitors system usage and allocates threads or thread limits, and that may even be able to cancel or pause threads. It is an interesting idea, but because the system load is so dynamic, it should be well thought through.
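To give an idea of what the thread-limit part of such a manager could do, here is a small Python sketch. No such interface exists in ETF; `ThreadBudget` and `lease` are invented names, and the sketch covers only the allocation of thread limits, not load monitoring or cancelling and pausing threads.

```python
import threading
from contextlib import contextmanager
from typing import Iterator


class ThreadBudget:
    """Hypothetical manager that hands out worker slots from a fixed budget."""

    def __init__(self, total: int) -> None:
        if total < 1:
            raise ValueError("total must be at least 1")
        self._available = total
        self._cond = threading.Condition()

    @contextmanager
    def lease(self, requested: int) -> Iterator[int]:
        """Grant between 1 and `requested` slots, blocking while none are free."""
        if requested < 1:
            raise ValueError("requested must be at least 1")
        with self._cond:
            while self._available == 0:
                self._cond.wait()
            granted = min(requested, self._available)
            self._available -= granted
        try:
            yield granted  # the ETS may use at most this many threads
        finally:
            with self._cond:
                self._available += granted
                self._cond.notify_all()
```

An ETS would ask for the parallelism it could use and receive the limit it is actually allowed, which may be lower when other Test Runs are active. Reacting to the measured system load would mean adjusting the budget at runtime, which this sketch leaves out.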

We haven't really used parallel execution on the ETS level, but briefly discussed some ideas. This could be a starting point for data tests and this for service tests.

cportele commented 6 years ago

Can be closed as all sub-topics have been moved to specific EIPs. The last one is #49.

michellutz commented 6 years ago

I have also removed it from the project board.