Closed spring-projects-issues closed 16 years ago
Kyrill Alyoshin commented
I've created a patch that implements the functionality discussed on the forums. Test coverage is 99%. I'd be delighted to see it incorporated. (I will also submit it to BATCH-371).
Kyrill Alyoshin commented
I've noticed that 371 is closed, so I will not create a duplicate submission there. :-)
Evaluating 371...
Kyrill Alyoshin commented
Hmm... 371...
I see where you're going. I am afraid that FieldSet + LineAggregator combination does not provide the flexibility we need to generate real mainframe files. The key point is that alignment, padding (and maybe even text conversion strategies) need to be specified on a per element, or field, basis. The best I could do without having to break the existing public APIs is to tweak the Range class to provide this abstraction.
Unmapping the object into a FieldSet is an interesting idea, the problem is that each Field in a FieldSet must contain meta-data that describes how it is to be persisted in the flat file. And that is currently missing.
FieldSet, as an abstraction, borrows heavily from JDBC's ResultSet. ResultSet has been designed for 'read' operations. We do not use ResultSets to write into the DB. My gut feeling is that FieldSet is not the right abstraction for 'write' operations.
Please take a look at the patch to see what I mean in detail. Once again, my approach where the Range class acts as field descriptor solves this problem.
Dave Syer commented
I hear what you are saying about meta data. (But note that nothing you say has anything specifically to do with mainframes, or COBOL.) I'm not sure it is necessary in the FieldSet API - FieldSet might be inspired by ResultSet, but that doesn't mean we have to copy it all.
If all you need is to format with different alignments in each field, I'm sure there is an easier way. And I agree that FieldSet is not an output-oriented construct. But LineAggregator is, so I would prefer to do it there, along the lines of your suggestion with Range.
Lucas Ward commented
Kyrill,
It looks like your patch was created against m4, because of changes made in M5-SNAPSHOT, it won't compile without adapting it. Do you mind creating the patch against trunk?
Kyrill Alyoshin commented
Lucas,
Nope. Against the trunk. Just updated and rebuilt everything. Everything works. Even unit tests pass. What compiler errors do you get? You can send email with errors to me directly -kyrill007 at gmail-.
Kyrill
Kyrill Alyoshin commented
Dave,
But LineAggregator has been changed to operate on a FieldSet: #aggregate(FieldSet). Presumably we'd have to write a whole new FieldSet implementation, which would allow for individual field configurations. Today this is just names and values. But what we need is a FieldSet that has a SortedSet of some elements (Ranges?) that allow for the following property configurations:
It'd be interesting to know how this can be done with FieldSet... I'd suggest that refactor the #aggregate(FieldSet) method inot
What do you think?
Douglas C. Kaminsky commented
Addressing your comments directly -- a fieldset is EXACTLY a string array plus some extra functionality (e.g. the ability to name each column). There is no reason you wouldn't want to use a fieldset that I can think of. Your FieldSetUnmapper (as it's called today - I asked Dave to change the name but we'll see what happens) can do any sorting/etc (i.e. change your domain-specific object into a string array+) and your aggregator can handle formatting (i.e. transform each string from the string array+ into your desired output). Please explain exactly why you think the aggregator that solves the above-mentioned issue (BATCH-278) can't fulfill your use case.
Douglas C. Kaminsky commented
Further expanding on your question about needing to create a custom FieldSet, I'd also like to mention that I am totally opposed to FieldSet being an interface as opposed to a concrete implementation - the FieldSetMapper/Unmapper and LineTokenizer/Aggregator COMPLETELY OBVIATES the need to have different FieldSet implementations. There is no reason to use anything other than the DefaultFieldSet.
(The argument that FIeldSet parallels ResultSet is erroneous --- they represent very different concepts.)
Kyrill Alyoshin commented
Doug, what do you mean 'erroneous'?
This is straight from reference documentation:
"A FieldSet is Spring Batch's abstraction for typing fields from a flat file data source. It allows developers to work with file input in much the same way as they would work with database input. A FieldSet is conceptually very similar to a Jdbc Result Set."
What I am trying to find is where to store per field configuration properties?
They are once again: field position in a record, its property binding (can be literal), alignment, padding, and optional property editor.
You realize that each field in a FieldSet may need to be formatted individually? So, where are those "fields"? Where is the object that stores this config data?
What you're suggesting is that I preprocess my bean into a String array, then stick this String array into a FieldSet and then I aggregate it into line?
Kyrill Alyoshin commented
I suppose if we added a Field class (or renamed Range into Field) and made a FieldSet contain a sorted set of fields, this would be a different story. The it would really be a FIELD set, where a field contains property that describe it. We could then utilize it in a nice way for bi-directional navigation (as in BeanWrapperFieldSetMapper and BeanWrapperLineAggregator).
Hmm... the more I think about it, the more I like it. Of course, it may break existing public APIs. But it does seem elegant.
Dave Syer commented
What you're suggesting is that I preprocess my bean into a String array, then stick this String array into a FieldSet and then I aggregate it into line?
What's wrong with that? Actually, it seems to me to show only that FieldSet is already over-engineered for output - String[] would work just as well. I don't want to introduce extra meta data unless it is really necessary.
Douglas C. Kaminsky commented
By erroneous, I mean that although that was the original thought on the issue, it has since become increasingly clear to me that the similarities end at the naming. It is not a productive argument to argue that field set should contain some functionality or form for the sake of symmetry with result set.
Kyrill, you're trying to over-engineer things here. Your needs can be satisfied if you create a FieldSetUnmapper to map your object into a field set, then write a LineAggregator that contains a map from either field name (from the field set metadata) to format or from index number to format. That's the crux of BATCH-278 and the reason it has not yet been resolved is because I was waiting for Dave to fix BATCH-371. The solution to BATCH-278 is a LineAggregator that does exactly what you want - lets you specify width, alignment and padding for each field in a field set.
Dave, I think keeping field set is important at the very least for the naming of the fields - although I'm not thrilled with the idea, you could create an implementation that contains only the string array and naming aspects and use that for writing by default, while refactoring the current DefaultFieldSet to extend this new class and be used for reading.
Kyrill Alyoshin commented
OK. Let's attempt to really think it through.
Obviously, everything begins with a Tasklet. For each execution, the tasklet will create a bean populated with relevant data (to be transformed in a line later on).
The tasklet will use FlatFileItemWriter to write each individual line to a file.
The FlatFileItemWriter will delegate to the ItemTransformer to do the actual transformation of a bean into a line.
This is where troubles begin. The patch includes a BeanWrapperItemTransformer that transforms the bean into such a line and this is the end of story. But you say this is not the ideal solution. OK.
Now, I guess we have to figure out how to make FieldSetCreator + LineAggregator + some ItemTransform play all nice together. Speaking of the complexity...
I guess the first thing is to figure out what my BeanWrapperItemTransformer needs to refactored into? Presumably into BeanWrapperFieldSetCreator.
Where is the actual transformation of bean properties into padded String[] occurs? BeanWrapperFieldSetCreator? The point is it has to occur where the line configuration/metadata is stored (i.e where Range object are, since they describe the per field formatting needs).
If so, how does BeanWrapperFieldSetCreator get hooked into the transformation cycle (again the assumption is that we're going to use FlatFileItemWriter that delegates to a Transformer).
I'd love you, guys, to take a closer look at it. All of my needs are expressed in the unit tests in the patch.
Dave Syer commented
OK, here are the collaborations:
Forget about Tasklet, and ItemTransformer
FlatFileItemWriter delegates to a FieldSetCreator and then a LineAggregator
Observations:
There is no reason for FieldSetCreator to know about padding, but it does know about field->String conversion, and about the field names (if it wants to).
BeanWrapper doesn't know about "unwrapping", so BeanWrapperFieldSetCreator is sort of a misnomer, but I guess we can hack it for now. There is another issue somewhere (I think against SWF for Spring Binding) asking about this inverse binding problem (I think it was you, Kyrill, who logged that one?). It won't be dealt with properly in Spring Core until 3.0, so we might want to do something here now. The FieldSetCreator might like to provide the field names.
For your original use case the LineAggregator has to know about padding per field instead of per line. Something like a Range specification will do it. I would prefer not to overload the Range with extra information about padding, but I am prepared to consider it if it seems necessary.
Kyrill Alyoshin commented
Dave,
First, thanks for stepping up and clarifying things. I promise I will buy you a beer at Java ONE. :-)
Second, yep, it was I who logged that issue for Spring Binding. As you can see, I like bindings. :-)
Third, I suppose if BeanWrapperFieldSetCreator takes a sorted range as its state, the problem is solved. Existing BeanWrapperItemTransformer can be refactored into BeanWrapperFieldSetCreator.
The only thing that I think we have not nailed is where field conversion (i.e. padding, alignment) is performed. I mean if BWFSC takes a set of ranges, it probably will have to apply the "padding and co" to form the resulting String[] -> FieldSet. Then the line aggregator is almost not needed (i.e. StringUtils.arrayToDelimitedString(String[], ""). However if LineAggregator is the one to know about Ranges (i.e. per field formatting) then the logical flow is not clear.
Douglas C. Kaminsky commented
Posted a patch for http://jira.springframework.org/browse/BATCH-278 - this provides some of the functionality you discussed. The other piece is the FieldSetCreator interface which you can use to map any bean to a field set in a custom manner. Hope this helps.
Kyrill Alyoshin commented
I think Dave needs to step in and sort this whole thing out.
Dave, any plans to come back to this issue?
Mike Grudkowski commented
Hi,
My team is considering including this patch for our project, but are hesitant to go ahead before resolution has been made regarding approach. It appears there is an alternative approach taken in the patch to BATCH-278. To the point, has any resolution been made?
Mike
Dave Syer commented
The patch attached to this issue is sort of obsolete. The recommended approach would be to implement a FieldSetCreator. Which features of the patch were you interested in?
Mike Grudkowski commented
Thanks for the quick reply.
Well, we like Range enhancements and the RangePropertyEditor which lets us specify how each domain property we are interested in writing should be padded and aligned. We are currently taking the recommended approach, delegating to a tokenizer inside the FieldSetCreator implementation to create an array of tokens from the domain object. The tokenizer contains all the padding and alignment logic and is essentially hidden, i.e. not externalized.
So we are interested in the externalization features. Why is the approach considered obsolete?
Mike Grudkowski commented
Quick question Dave...Is the plan for BATCH-278 to include support for something similar to the logback formatting scheme?
I like the idea of specifying both padding, alignment, and replacement chars all in one conversion string instead of in several maps. That said, both approaches would help us out over what currently exists.
Kyrill Alyoshin opened BATCH-379 and commented
I am creating this issue as per our discussion on the forum: http://forum.springframework.org/showthread.php?t=50182
I will create a patch with the proposed implementation and attach it to this issue once it is done.
Affects: 1.0.0.m4
Attachments: