art-framework-suite / art-root-io


SamplingInput only wants to do one event #18

Closed gaponenko closed 1 month ago

gaponenko commented 2 months ago

Hello,

Jobs that use the SamplingInput source process only a single event by default. This can be seen from the output of mu2e --print-description SamplingInput.

My expectation is that the job should continue until one of the input datasets is exhausted, similar to what RootInput does. (Unlike RootInput, we may have multiple datasets here, so only one of them will be fully used.) Unfortunately, setting maxEvent to a large number makes the source recycle input events instead of exiting at the end of input, so I do not see a way to use the inputs efficiently without oversampling.

Andrei

knoepfel commented 2 months ago

We will analyze the current implementation and find a way to provide what you need.

sophiemiddleton commented 1 month ago

Hi @knoepfel, just checking on this: has there been any progress? Thanks, Sophie

knoepfel commented 1 month ago

@sophiemiddleton, I talked with @gaponenko a couple of weeks ago, and he mentioned that the job does end whenever one of the datasets is exhausted. However, he requested that we provide the ability to specify --nevts -1 (as we support in RootInput) so that the user does not need to specify an otherwise arbitrary number of events to process.

Do you have other information to add here?

sophiemiddleton commented 1 month ago

Thanks. Yes, I think the -1 option is what we would want eventually, but I am trying a workaround from Andrei. I will let you know if it works.

knoepfel commented 1 month ago

Commits https://github.com/art-framework-suite/art/commit/4fdb49687444502f9a9c175545671759f57dc327 and https://github.com/art-framework-suite/art-root-io/commit/b1b889f19b4908bee0a0723328a01fd94479fdff now implement the desired behavior, allowing a specification of -n -1 to indicate that the job should continue until a dataset is exhausted.

Please advise when a new release of art/art-root-io is required.
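
For readers following the thread, the behavior discussed above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the art or art-root-io implementation: the function name sample_until_exhausted, its parameters, and the weighted random choice between datasets are all hypothetical. It shows a limit of -1 meaning "no event limit", with the loop ending as soon as a draw lands on a dataset that has no unread events left, so no event is ever reused.

```python
import random


def sample_until_exhausted(datasets, weights, max_events=-1, seed=0):
    """Toy model of sampling from several datasets without recycling events.

    datasets   -- dict mapping a (hypothetical) dataset name to a list of events
    weights    -- dict mapping the same names to relative sampling weights
    max_events -- stop after this many events; -1 means no limit
    """
    rng = random.Random(seed)
    names = list(datasets)
    cursors = {name: 0 for name in names}  # index of next unread event
    sampled = []
    while max_events < 0 or len(sampled) < max_events:
        # Pick which dataset supplies the next event, according to the weights.
        name = rng.choices(names, weights=[weights[n] for n in names])[0]
        if cursors[name] >= len(datasets[name]):
            break  # the chosen dataset is exhausted: stop instead of recycling
        sampled.append((name, datasets[name][cursors[name]]))
        cursors[name] += 1
    return sampled


if __name__ == "__main__":
    data = {"signal": list(range(3)), "background": list(range(100))}
    events = sample_until_exhausted(data, {"signal": 1, "background": 1})
    print(len(events), "events sampled before a dataset ran out")
```

With max_events=-1 the loop runs until one dataset has been fully read; with a non-negative limit it stops at that limit or at exhaustion, whichever comes first.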