iris-hep / idap-200gbps-atlas

benchmarking throughput with PHYSLITE

Large Datasets get duplicated files #53

Closed gordonwatts closed 4 months ago

gordonwatts commented 4 months ago

The problem of extra files:

This dataset has 64803 files. The file count does reach that number and then stabilizes for 5 minutes, but after that new files start coming in again.

We believe this is because it takes more than 30 minutes for the files to be inserted. RabbitMQ thinks the DID finder has died, takes the message back, and then re-sends it.

This is from #49.
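
The suspected mechanism can be illustrated with a toy model. This is only a sketch, not code from ServiceX or the DID finder: the function name, the 35-minute insert time, and the cap of two deliveries are hypothetical, and it assumes a redelivered message makes the consumer insert the whole file list again from the start.

```python
def simulate_deliveries(n_files, insert_minutes, timeout_minutes=30.0, max_deliveries=2):
    """Toy model: return every file index inserted across deliveries of one message.

    Assumption (not confirmed by the thread): each redelivery re-inserts the
    complete file list, and the broker redelivers whenever the insert takes
    longer than the ack timeout.
    """
    inserted = []
    for _delivery in range(max_deliveries):
        # The consumer inserts the full file list for this delivery.
        inserted.extend(range(n_files))
        if insert_minutes <= timeout_minutes:
            # Acked within the timeout, so the broker does not redeliver.
            break
    return inserted


# Hypothetical 35-minute insert against the 30-minute default timeout.
records = simulate_deliveries(64803, insert_minutes=35.0)
print(len(records), len(set(records)))
```

In this model the slow insert yields twice as many records as there are distinct files, which matches the symptom of the count passing 64803 after a pause.
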

BenGalewsky commented 4 months ago

This is true. The default message timeout is 30 minutes.

We can fix this by adding the following to values.yaml:

rabbitmq:
  # Set the message timeout - channel will be closed if a message is not acked
  # within this time. Set to 1 hour.
  extraConfiguration: |-
    consumer_timeout = 3600000
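
Since consumer_timeout is given in milliseconds, it is easy to drop or add a zero. The following is a minimal sketch for deriving the value from minutes and rendering the config line. The helper names are hypothetical and not part of any chart or library.

```python
def consumer_timeout_ms(minutes):
    """Convert a timeout in minutes to the milliseconds RabbitMQ expects."""
    return round(minutes * 60 * 1000)


def render_consumer_timeout(minutes):
    """Render the rabbitmq.conf line for the given timeout (hypothetical helper)."""
    return f"consumer_timeout = {consumer_timeout_ms(minutes)}"


print(render_consumer_timeout(30))  # the 30-minute default mentioned above
print(render_consumer_timeout(60))  # the 1-hour value used in values.yaml
```

The 1-hour case reproduces the 3600000 used in the snippet above. A longer insert would need a correspondingly larger value.
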

gordonwatts commented 4 months ago

Ok - this was fixed! We'll stress test it in Week 5, when we have all the upgrades in and are ready to re-run the tests.