WASdev / standards.jsr352.jbatch

Home of 'jbatch', a compatible implementation of the Jakarta Batch specification (and the former Reference Implementation for the JSR 352, Batch Applications for the Java Platform specification).

JobContext.getTransientUserData() is not job-scoped #50

Closed yrodiere closed 7 years ago

yrodiere commented 7 years ago

In JBatch, a new JobContext seems to be created whenever a part of the job is executed in a new thread.

Thus, crucially, one cannot set the transient user data (JobContext.setTransientUserData()) at the beginning of a job (in a job listener, for instance) and re-use it during the whole execution: the new instances of JobContext would just have null instead of the previously set user data.

This is a bit strange, because one would expect a "job context" to be "job-scoped", and obviously it is not: it is thread-scoped.
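A minimal sketch of the behavior described above, assuming the context is held per thread. This is a toy model and not JBatch's actual code: `ToyJobContext`, `currentContext()` and `readFromWorkerThread()` are hypothetical names invented for illustration, and the `ThreadLocal` is only one way such thread-scoping could come about.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Toy model only: NOT the JBatch implementation.
class ThreadScopedContextDemo {

    /** Hypothetical stand-in for JobContext, reduced to the transient user data. */
    static final class ToyJobContext {
        private Object transientUserData;

        Object getTransientUserData() { return transientUserData; }

        void setTransientUserData(Object data) { transientUserData = data; }
    }

    // Assumption: each thread lazily gets its own context instance.
    private static final ThreadLocal<ToyJobContext> CONTEXT =
            ThreadLocal.withInitial(ToyJobContext::new);

    static ToyJobContext currentContext() {
        return CONTEXT.get();
    }

    /** Returns the transient user data as seen from a freshly started worker thread. */
    static Object readFromWorkerThread() throws Exception {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Callable<Object> task = () -> currentContext().getTransientUserData();
            return worker.submit(task).get(5, TimeUnit.SECONDS);
        } finally {
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Set once on the "top-level" thread, as a job listener might do.
        currentContext().setTransientUserData("set by the job listener");

        System.out.println("main thread:   " + currentContext().getTransientUserData());
        // The worker thread gets its own context, so the data is not visible there.
        System.out.println("worker thread: " + readFromWorkerThread());
    }
}
```

In this model the thread that set the data can read it back, while the worker thread sees null, which matches what the new JobContext instances return.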

I only witnessed this behavior in JBatch for the moment. For instance JBeret shares the whole JobContext across the whole job execution. But the specs do not seem to explicitly state anything about the transient user data, in particular they do not state what "transient" means exactly.

Is this a bug, or does it work as intended?

scottkurz commented 7 years ago

That's working as designed, see Section 9.4.1.1 of the spec:

> There is one JobContext per job execution. It exists for the life of a job. There is a distinct JobContext for each sub-thread of a parallel execution (e.g. partitioned step).

Part of the idea here is that your application is forced to design around the case where, for example, the partitions run in separate JVMs (even though JBatch doesn't support this, natively anyway).

It's certainly come up before in spec discussions that it would be nice to have a richer set of capabilities for cases like this, which are pretty mainline paths, not just "corner cases". I don't think we have one thread or spec issue to point to right now though.

In the meantime, in a partition you can use the PartitionMapper to pass dynamic data from the top-level thread to the individual partitions. It's certainly not as simple as just set/get from JobContext though.
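As a rough sketch of that workaround, the snippet below builds the per-partition `Properties` array that a `PartitionMapper` would typically hand to its `PartitionPlan` (via `setPartitionProperties(Properties[])`, alongside `setPartitions(int)`). To stay runnable without the batch API on the classpath, it uses only `java.util.Properties`. The class name, the helper method and the property names `sharedValue` and `partitionIndex` are assumptions made for illustration.

```java
import java.util.Properties;

// Sketch only: shows the data a PartitionMapper would pass, not a full mapper.
class PartitionPropertiesSketch {

    /**
     * Builds one Properties object per partition. Every value must be a String,
     * so anything computed on the top-level thread has to be encoded as text.
     */
    static Properties[] buildPartitionProperties(int partitions, String sharedValue) {
        Properties[] props = new Properties[partitions];
        for (int i = 0; i < partitions; i++) {
            props[i] = new Properties();
            // Hypothetical property names, chosen for this example.
            props[i].setProperty("sharedValue", sharedValue);
            props[i].setProperty("partitionIndex", Integer.toString(i));
        }
        return props;
    }

    public static void main(String[] args) {
        for (Properties p : buildPartitionProperties(3, "tenant-42")) {
            System.out.println(p.getProperty("partitionIndex") + " -> " + p.getProperty("sharedValue"));
        }
    }
}
```

On the partition side, the job XML would then reference these values with substitutions such as `#{partitionPlan['sharedValue']}`, and the partitioned artifact would receive them as injected batch properties.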

I'm not sure whether JBeret views that as an "extension" or what, but perhaps the TCK doesn't enforce one behavior over the other. I'm only guessing, though; I haven't contributed to JBeret.

yrodiere commented 7 years ago

> That's working as designed, see Section 9.4.1.1 of the spec:

I missed this part. Thank you.

> In the meantime, in a partition you can use the PartitionMapper to pass dynamic data from the top-level thread to the individual partitions. It's certainly not as simple as just set/get from JobContext though.

Indeed, it's not as simple: we can only pass strings, though that makes sense knowing that you envisioned remote execution of partitions. It basically means we have to re-implement, at the partition level, whatever set the transient user data in the first place. Since in my case I wanted to make the retrieval code pluggable, that's not very practical. I guess I will have to support customization through injection only.

Anyway, that's the way the spec is, so I'll have to find a workaround for now. Thank you for answering so quickly.

scottkurz commented 7 years ago

I'm going to close this now. Feel free to open a spec issue as a suggestion for 1.1 at: https://github.com/WASdev/standards.jsr352.batch-spec/issues (Though it might end up being more useful to address this from the more general angle of passing data from one step to another... and also, I don't think we'll forget about it if you don't open a spec issue).