spring-projects / spring-batch

Spring Batch is a framework for writing batch applications using Java and Spring
http://projects.spring.io/spring-batch/
Apache License 2.0
2.72k stars 2.35k forks source link

StepScope responsibilities can be assumed by Step (not ApplicationContext) [BATCH-364] #3215

Closed spring-projects-issues closed 16 years ago

spring-projects-issues commented 16 years ago

Dave Syer opened BATCH-364 and commented

StepScope responsibilities can be assumed by Step (not ApplicationContext). The aim (to clarify issues raised below), is to make scope="step" strongly advised but not mandatory for item readers and writers in simple steps. Application programmers are very welcome to use scope="step" where they need access to the context through StepContextAware, since this is consistent with other custom scope usages. They are also advised to use scope="step" wherever there is a possibility of more than one thread executing the same step - as in the case of a JMX launcher (see samples) or a web service that runs jobs. Step scope is not necessary for single JVM, single Job processes, but it would be recommended to use it anyway, in case the job is ever run in a mult-threaded container.


Affects: 1.0.0.m4

Issue Links:

spring-projects-issues commented 16 years ago

Dave Syer commented

Still in progress. Most if not all samples can probably now have the scope="step" removed from their readers and writers.

spring-projects-issues commented 16 years ago

Douglas C. Kaminsky commented

I always thought step scope was a very powerful and Spring-oriented feature. It seems odd to me for a subproject of Spring that has a very useful implementation of a Spring feature to deprecate it in favor of a less configurable and more programmatic solution. Can you please expand on the reasoning behind such a change?

spring-projects-issues commented 16 years ago

Dave Syer commented

There's no deprecation - step scope is very powerful and useful (hopefully). The object of this exercise is to make step scope into good practice, without it being required for a job to work. There are no other examples of Spring custom scopes where their use is mandatory, so we are just trying to make step scope more like the others. From the application programmer's point of view it should continue to behave as before, and we will recommend that step scope is used for readers and writers because they are usually stateful, but it won't be required.

spring-projects-issues commented 16 years ago

Douglas C. Kaminsky commented

Then I'm not sure what exactly you are proposing to do. How would this change the use case for step scope?

The consensus with Lucas, Wayne, Robert and Scott seemed to be that this issue needs a more detailed description. Can you describe in more detail what your intentions are?

spring-projects-issues commented 16 years ago

Douglas C. Kaminsky commented

Sorry, I just saw your edit of the issue description above. I disagree with your assessment that step scope is not necessary in single threaded environments. Take the example of a job that writes a file in one step, then reads it in the next. Depending on how the OS handles file access, you can create a race condition where the flushing of output does not actually occur in the first step because the file handle is not closed, so the second step only sees an empty file, and its read lock prevents the writer from writing.

This is something that actually happened to me a few months ago. It was also really tricky since without step scope the job worked in Windows but not under Solaris.

spring-projects-issues commented 16 years ago

Douglas C. Kaminsky commented

Caveat to the above comment: I do recognize that ItemStream does solve for some of this, but it feels much clunkier than step scope - step scope actually makes a lot of sense and since it is controlled via configuration, it is pretty elegant (minus the aop:scoped-proxy business).

spring-projects-issues commented 16 years ago

nebhale commented

(In reply to comment #5)

Take the example of a job that writes a file in one step, then reads it in the next. Depending on how the OS handles file access, you can create a race condition where the flushing of output does not actually occur in the first step because the file handle is not closed, so the second step only sees an empty file, and its read lock prevents the writer from writing.

Step scope doesn't solve this problem either. Only properly closing a reader or writer will solve this problem. The ItemStream contract makes this possible and the framework is guaranteeing a close on these resources when the step is completed (either successfully or with errors). I realize that the implementation of Step scope may close them as well, but the framework should not be dependent on this behavior. If I don't need to use step scope I shouldn't be forced to, and many of the usecases that we though Step scope was required to implement can instead be implemented with a singleton factory that returns a new instance every time.

In general the use of scopes is for situations where the framework or application is not driving the lifecycle, but rather is in a container that drives it. The perfect example is the request/session scopes where the servlet container is dirving lifecycle and there is no way an application could get proper access to the lifecycle of an instance. In our case, the framework is driving all executions and can instantiate objects at the proper time (I'm still a fan of a factory here and it's not unprecedented in the portfolio) removing the need for a magical scoping mechanism. You could use step scope where it's appropriate, but if I am only single threading a single execution through a single reader I should require all of the shenanigans.

spring-projects-issues commented 16 years ago

Dave Syer commented

Even single-threaded use will still still require step scope and if an application developer wants to take advantage of the StepContext injection callback. I think step scope is here to stay - we just don't require it any more.

And the StreamManager takes care of the close() callback, so even without scope="step" the files will be properly handled. I don't think this is clunky.

spring-projects-issues commented 16 years ago

Douglas C. Kaminsky commented

Re: Ben's last post

That's my point - using the existing init-method / destroy-method and InitializingBean / DisposableBean paradigms to properly close readers and writers via step scope makes things a lot cleaner and more in tune with existing Spring functionality.

Having a factory or a prototype scope is fine for creating new objects that don't clash with existing objects, but does nothing for resource management.

spring-projects-issues commented 16 years ago

Lucas Ward commented

Dave,

What's the status on this issue? As far as I can tell, open and close is called on ItemStream by the SimpleStreamManager. Only the ItemReader and ItemWriter that are wired into the step appear to be registered for this. Is there anything left on this issue? It's assigned to me, so I wanted to make sure.

spring-projects-issues commented 16 years ago

Dave Syer commented

That's where I think we are too, just checking. It's a little irritating actually - I had to go in and add a couple of instanceof to things like DelegatingItem* today. But if you are happy that open/close are actually called in the step we are good to close this one.

spring-projects-issues commented 16 years ago

Lucas Ward commented

I have a couple of things I would like to change, but they are more related to BATCH-365.

Mostly, it's that open() and restoreFrom(EC) could just as easily be open(EC), I also think the getExecutionAttributes() with aggregation can be removed as well, as long as there's a fairly trivial hook in the readers and writers to allow for specifying the key (a la name, but you could call it anything) I've mentioned it before, but again, more related to hte item stream rework.

spring-projects-issues commented 16 years ago

Dave Syer commented

OK, I lied. It looks like StepScope is going to die (BATCH-378, BATCH-398). I think this is a good thing on the whole (get ri dof that funkiness).

spring-projects-issues commented 16 years ago

wordywordy commented

Please consider to add back Step scope and add Job scope.

Scope concepts are important and good pattern for life-cycle management of resource. Job and step scope are definitely crucial concept for Spring Batch framework, which intended to make "Job" and "Step" as the first class citizen. If such a common and general requirement is needed to re-implement or mimic by custom approaches, the value (or attractiveness) of the Spring Batch framework will be largely reduced.

Another drawback of custom implement may induce incomplete (or improper) clean up of shared resources.

spring-projects-issues commented 16 years ago

Dave Syer commented

I am generally suspicious, but still a little sympathetic to this requirement. It is important to understand that any such volatile context will significantly reduce the robustness of a batch job - it just won't be there if the lights go out and the job has to be restarted.

If someone comes up with some good use cases we can think about how to implement them. I suggest you post on the forum and try and get a discussion going there.

spring-projects-issues commented 16 years ago

Dave Syer commented

Assume closed as resolved and released