broadinstitute / cromwell

Scientific workflow engine designed for simplicity & scalability. Trivially transition between one off use cases to massive scale production environments
http://cromwell.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Metadata sawtooth in throughput / queue #4400

Open Horneth opened 6 years ago

Horneth commented 6 years ago

There is a strange behavior observed in the gotc Cromwell: the Metadata queue grows very large while throughput drops to zero (meaning events are not being written). Then all of a sudden "something" seems to "unlock" and the queued events all get written very quickly, clearing the queue. Corresponding Grafana screenshot: [image: screen shot 2018-11-16 at 5 13 02 pm]

ruchim commented 6 years ago

@Horneth any idea what the metadata summarization interval is?

Horneth commented 5 years ago

If it's the default, it will be 2 seconds. I doubt it was changed in the config, but @hjfbynara could confirm (the config path is services.MetadataService.config.metadata-summary-refresh-interval). I don't think the summarization is related, though, because the stall happens before the events are even written to the METADATA_ENTRY table. Unless the summarization somehow takes a lock on that table under some circumstances, preventing new events from being written...
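
For reference, a minimal sketch of where that setting would sit in the Cromwell config, built only from the path quoted above. The "2 seconds" value is the default as I remember it, not something confirmed for the gotc deployment:

```hocon
services {
  MetadataService {
    config {
      # Assumed value: the default as recalled in this thread,
      # not verified against the gotc configuration.
      metadata-summary-refresh-interval = "2 seconds"
    }
  }
}
```

If the gotc config has no such entry, whatever default Cromwell ships with applies.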