mminella / scaling-demos

Demos of the various methods of scaling Spring Batch

org.springframework.batch.item.file.FlatFileParseException: Parsing error - Spring Batch fixedwidth file #7

Open javaHelper opened 4 years ago

javaHelper commented 4 years ago

I am trying to read a fixed-width flat file, generated by a mainframe system, with Spring Batch. The file has no delimiters, and each complete record is 2000 characters long (columns 1-2000 for the first record, 2001-4000 for the next). The main issue is that, for a few records, the data is spread across two or three lines, and that is where reading the file fails.

Could you please guide me here: https://stackoverflow.com/questions/63674370/caused-by-org-springframework-batch-item-file-transform-incorrectlinelengthexce

test

00200621 SUNDAY   0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         000000000000000000                                                                                                                                                                     
560000000000411999999992052300000000D 0000        0000000000010000000100000040000000000000  00000000                    NYNNVX      N N 0      N004 000100000001000100000001000100000001000100000001000100000001000100000001000100000001                                                YNYNYYNNNNNYNNNN0004000000070000000300010000000000000000000000020000000000000000NN1N                         N00NNNND                                                                                                                                                       001NNN              00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000O840000000000000000000AN0201000000NNNC840                N  N00N A NN00400000000NNNNNUSAN       NNNN00000000000000NN141900INNNNNN   N                 N000000                 NN           200//0055//20000YNN MO    ¶200528000000       !!B3K555800000001A****00             00             00             00             00             00             00             00             00             00             00             00             00             00             00             00             0005230000000000000000                                                                                                                                                                 
560000000000711111000614000300000000D   00        0000000002010000389762888580000000000000  00000000                    NNNN        N N 6      N    000000000000000000000000000000000000000000000000000000000000000000000000000000000000                                                YYYYYYYYNNNYYYNN0002000300040005000100060009000700000000000000080010001100000000NN N                         N00NNNND                                                                                                                                                       001NNF00000000007   00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000B840000000000238976288NY0201000000NNNC840                N  N00Y N NN84000000000NNNNNUSAN       APPN00000000000000NN161912NBD NNN044N                 N000000                 YY           200//0011//00000YYU AL    ¶200109000000       !)+M555800000001A****00             00             00             00             00             00             00             00             00             00             00             00             00             00             00             00             0000030000000000000000                                                                                                                                                                 
560000000004011072000326003304550456C 0121                  010000000000008400000400040011F 00090000        #Eg‰#E    NNNN       NNNYN0      N    000000000000000000000000000000000000000000000000000000000000000000000000000000000000                5C7E571E8C176C54                YYYNYYNNNNNNNNNN0002000300040000000100050000000000000000000000000000000000000000NN N                         N00                                                                                                                                                               NNN           MSI00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000011110001111100011111000111111001000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000O840000000000000000000NY0102000000NNN                    NYNN00N0NNNN84000007504NNNNNUSAN   0MPFNNNN00130001000000NN160000NNN NNN   N                 N000000                 NN1600        155//1122//10009YYU     00  111111000000       Ô025500000255C****00             00             00             00             00             00             00             00             00             00             00             00             00             00             00             00             0000330004550004560000                                                                                                                                                                 
560000000010111111000614000300000000D 0300        0000000002010000000000008580000000000000  00000000                    NNNN        N N 6      N840 000000000000000000000000000000000000000000000000000000000000000000000000000000000000                                                YYYYYYYYNNNYYYNN0002000300040005000100060009000700000000000000080010001100000000NN N                         N00NNNND                                                                                                                                                       001NNF00000000101   00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000B840000000000200000000NY0202000000NNNC840                N  N00Y N NN85800000000NNNNNUSAN       APPN00000000000000NN161912NBDNNNN044N                 N000000                 YY           200//0011//10000YYU AL    ¶200117000000       !GÛI555800000001A****00             00             00             00             00             00             00             00             00             00             00             00             00             00             00             00             0000030000000000000000                                                                                                                                                                 
560000000011111111000614000300000000D   00        0000000002010000389762888580000000000000  00000000                    NNNN        N N 6      N    000000000000000000000000000000000000000000000000000000000000000000000000000000000000                                                YYYYYYYYNNNYYYNN0002000300040005000100060009000700000000000000080010001100000000NN N                         N00NNNND                                                                                                                                                       001NNF00000000111   00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000B840000000000238976288NY0202000000NNNC840                N  N00Y N NN84000000000NNNNNUSAN       APPN00000000000000NN161912NBD NNN044N                 N000000                 YY           200//0011//10000YYU AL    ¶200110000000       !­J555800000001A****00             00             00             00             00             00             00             00             00             00             00             00             00             00             00             00             0000030000000000000000                                                                                                                                                                 
560000000045711072000326003300330033C   00                  010000000000008400000000000000  00000000                    NNNN       NNNYN0      N    000000000000000000000000000000000000000000000000000000000000000000000000000000000000                                                NNNNNYNNNNNNNNNN0000000000000000000000010000000000000000000000000000000000000000NN N3ABCD                    N00NNNN                                                                                                                                                           NNN              00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000011101000000020002220000033303000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000O840000000000000000000NY0102000000NNN                    NYNN00N0NNNN84000000000NNNNNUSAN   0MPFNNNN00000000000000NN160000NNN NNN   N                 N000000                 NN CP        199//0077//00009YNU     00'
080604000000        ²›p999700082768C****00             00             00             00             00             00             00             00             00             00             00             00             00             00             00             00             0000330000330000330000                                                                                                                                                                 
560000000456 10072000326045604560456D 0100                  010000000020008401000000000000  00000000                    NYYYVXVXVX NYNYY5      Y840 0010000002000005000001000005000001000005000001000005000001000005000001000005000001005D60811AB3DB1E73                                YNNNYNNNNNNNNN  0002000000000000000100000000000000000000000000000000000000000000NN N                         N00                                                                                                                                                               NNN           MSI00022200000000000000003330000300003000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000112222303111121112200122223001333220033333300330000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000O840000000000000000000NY0102000000NNN                    NYNN00N0NNNN84077777701YNNNNUSAN   0   NNNN00000000000000NN160000NBB NNN   N                 N000000                 NN1600  000000144//0022//00009YYU MI  0016050507000000 USA   ¹œ…¸025500000255C****00             00             00             00             00             00             00             00             00             00             00             00             00             00             00             00             0004560004560004560000                                                                                                                                                                 
javaHelper commented 4 years ago

@mminella - I've implemented something like the code below, because a 2000-character record is spread across two lines here.

@Configuration
public class JobConfig {
    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Bean
    public BlankLineRecordSeparatorPolicy recordSeparatorPolicy() {
        return new BlankLineRecordSeparatorPolicy();
    }

    @Bean
    public CacheManager cacheManager() {
        return new ConcurrentMapCacheManager();
    }

    @Bean
    public FlatFileItemReader<Customer> customerItemReader(){
        FixedLengthTokenizer tokenizer = new FixedLengthTokenizer();
        tokenizer.setNames("firstValue", "secondValue", "thirdValue", "fourthValue", "fifthValue", "sixthValue", 
                "seventhValue", "eighthValue", "ninethValue", "tenthValue", "dummyRange");
        tokenizer.setColumns(
                new Range(3, 6), new Range(7, 13), new Range(14,15), new Range(16,24), new Range(25, 28), new Range(29,32), 
                new Range(33, 36), new Range(1322, 1324), new Range(1406, 1408), new Range(1543, 1548), new Range(1549));

        CompleteFlatFileLineMapper mapper = new CompleteFlatFileLineMapper();
        mapper.setTokenMaxLength(2000);
        mapper.setTokenizer(tokenizer);
        mapper.setFieldSetMapper(new CustomerFieldSetMapper());
        mapper.setCacheManager(cacheManager());

        FlatFileItemReader<Customer> reader = new FlatFileItemReader<>();
        reader.setLinesToSkip(1);
        reader.setResource(new ClassPathResource("/data/test.conv"));
        reader.setLineMapper(mapper);
        reader.setStrict(false);
        return reader;
    }

    @Bean
    public CustomerItemWriter customerItemWriter(){
        return new CustomerItemWriter();
    }

    @Bean
    public Step step1() {
        return stepBuilderFactory.get("step1")
                .<Customer, Customer> chunk(10)
                .reader(customerItemReader())
                .writer(customerItemWriter())
                .build();
    }

    @Bean
    public Job job() {
        return jobBuilderFactory.get("job")
                .start(step1())
                .build();
    }   
}
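
As a side note on the column ranges above: Spring Batch's `Range` is 1-based and inclusive at both ends, so `new Range(3, 6)` selects characters 3 through 6, and the single-argument `new Range(1549)` is open-ended. The plain-Java sketch below only mimics that slicing without Spring Batch; the `ColumnSlicer` class and its methods are hypothetical names used for illustration, and clipping a range at the end of a short line is an assumption of this sketch, not a statement about the tokenizer's behaviour.

```java
public class ColumnSlicer {

    /**
     * Returns the characters in columns min..max (1-based, inclusive).
     * A max beyond the end of the line is clipped to the line length.
     */
    public static String slice(String line, int min, int max) {
        if (min < 1 || max < min) {
            throw new IllegalArgumentException("Invalid range: " + min + "-" + max);
        }
        if (min > line.length()) {
            return ""; // the range starts past the end of the line
        }
        int end = Math.min(max, line.length());
        // substring uses a 0-based start and an exclusive end
        return line.substring(min - 1, end);
    }

    /** Open-ended variant: from column min to the end of the line. */
    public static String sliceFrom(String line, int min) {
        return slice(line, min, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        String line = "ABCDEFGHIJ"; // made-up sample line
        System.out.println(slice(line, 3, 6));
        System.out.println(sliceFrom(line, 8));
    }
}
```

This makes it easy to check by hand which characters a given range picks out of a sample record before wiring the ranges into the tokenizer.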

CompleteFlatFileLineMapper.java

@Data
public class CompleteFlatFileLineMapper implements LineMapper<Customer>, InitializingBean {
    private LineTokenizer tokenizer;
    private FieldSetMapper<Customer> fieldSetMapper;
    private Integer tokenMaxLength;
    private CacheManager cacheManager;
    private boolean isAppend = false;

    @Override
    public void afterPropertiesSet() throws Exception {
        Assert.notNull(tokenizer, "The LineTokenizer must be set");
        Assert.notNull(fieldSetMapper, "The FieldSetMapper must be set");
    }

    @Override
    public Customer mapLine(String line, int lineNumber) throws Exception {

        // check if current line length is less than Max length
        if (line.length() < tokenMaxLength) {
            // Store the value in the cache and append the next line's value
            if (cacheManager.getCache("row").get("cust") == null) {
                cacheManager.getCache("row").put("cust", line);
                isAppend = true;
            } else {
                line = this.getFullLine(line);
                isAppend = false;
                cacheManager.getCache("row").clear();
            }
        }

        if (!isAppend) {
            Customer c = fieldSetMapper.mapFieldSet(tokenizer.tokenize(line));
            c.setLineNumber(lineNumber);
            return c;
        }

        return new Customer();
    }

    private String getFullLine(String line) {
        String finalLine;
        StringBuilder sb = new StringBuilder((String) cacheManager.getCache("row").get("cust").get());
        sb.append(line);
        finalLine = sb.toString();
        return finalLine;
    }
}

Could you please confirm whether the above is the correct way of doing this, or suggest a proper solution with a sample example? Thanks in advance!
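
For comparison, one alternative to buffering inside the `LineMapper` would be to let the reader join the physical lines itself, which is the purpose of Spring Batch's `RecordSeparatorPolicy` (set through `FlatFileItemReader.setRecordSeparatorPolicy`). Below is a rough, framework-free sketch of such a policy. It assumes that every logical record is exactly the configured length once its pieces are concatenated, and that the line breaks are not themselves part of the data. The class name `FixedLengthRecordPolicy` and the `readRecords` helper are hypothetical; the helper only imitates a reader's join loop so the idea can be tried without Spring Batch on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class FixedLengthRecordPolicy {

    private final int recordLength;

    public FixedLengthRecordPolicy(int recordLength) {
        this.recordLength = recordLength;
    }

    // The three methods below have the same shapes as those of
    // Spring Batch's RecordSeparatorPolicy; in a real job this class
    // would declare "implements RecordSeparatorPolicy".

    /** A record is complete once enough characters have been accumulated. */
    public boolean isEndOfRecord(String record) {
        return record.length() >= recordLength;
    }

    /** Called on a partial record before the next line is appended. */
    public String preProcess(String record) {
        return record;
    }

    /** Called once on the completed record. */
    public String postProcess(String record) {
        return record;
    }

    /** Hypothetical helper imitating how a reader would apply the policy. */
    public List<String> readRecords(List<String> physicalLines) {
        List<String> records = new ArrayList<>();
        Iterator<String> it = physicalLines.iterator();
        while (it.hasNext()) {
            String record = it.next();
            // Keep appending physical lines until the record is long enough
            while (!isEndOfRecord(record)) {
                if (!it.hasNext()) {
                    throw new IllegalStateException(
                            "Unexpected end of input before record complete");
                }
                record = preProcess(record) + it.next();
            }
            records.add(postProcess(record));
        }
        return records;
    }

    public static void main(String[] args) {
        // Record length 10 instead of 2000 to keep the sample readable
        FixedLengthRecordPolicy policy = new FixedLengthRecordPolicy(10);
        List<String> lines = Arrays.asList(
                "AAAAAAAAAA", "BBBB", "BBB", "BBB", "CCCCCCCCCC");
        System.out.println(policy.readRecords(lines));
    }
}
```

With this approach the tokenizer would only ever see complete records, so no empty `Customer` placeholder has to be returned for partial lines, and a record split across three lines is handled the same way as one split across two. If a line break turns out to be a data byte (which packed mainframe fields can contain), `preProcess` would probably need to restore it before the next line is appended.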