spring-projects / spring-batch-extensions

Spring Batch Extensions
242 stars 258 forks source link

Issue with spring-batch-excel using Resource which might not have getFile() implemented and does not throw a FileNotFoundException exception #89

Closed mbazos closed 2 years ago

mbazos commented 2 years ago

The following code is used to read the excel sheets: StreamingXlsxItemReader.java:

    protected void openExcelFile(Resource resource, String password) throws Exception {
        try {
            File file = resource.getFile();
            this.pkg = OPCPackage.open(file, PackageAccess.READ);
        } catch (FileNotFoundException var4) {
            this.inputStream = resource.getInputStream();
            this.pkg = OPCPackage.open(this.inputStream);
        }

        XSSFReader reader = new XSSFReader(this.pkg);
        this.initSheets(reader, this.pkg);
    }

PoiItemReader.java:

    protected void openExcelFile(Resource resource, String password) throws Exception {
        try {
            File file = resource.getFile();
            this.workbook = WorkbookFactory.create(file, password, false);
        } catch (FileNotFoundException var4) {
            this.inputStream = resource.getInputStream();
            this.workbook = WorkbookFactory.create(this.inputStream, password);
        }

        this.workbook.setMissingCellPolicy(MissingCellPolicy.CREATE_NULL_AS_BLANK);
    }

It's nice that there is a fallback to attempt to use resource.getInputStream() but I ran into a problem with this spring-cloud project which uses a GoogleStorageResource and the issue is that the exception being thrown is UnsupportedOperationException which isn't handled by the code above. Please see here: https://github.com/spring-attic/spring-cloud-gcp/blob/main/spring-cloud-gcp-storage/src/main/java/org/springframework/cloud/gcp/storage/GoogleStorageResource.java#L244

To fix this wondering if it makes sense to check the Resource if it's a file and if that's true call getFile() otherwise attempt to use getInputStream(). So it would look like this:

                try {
                    if(resource.isFile()) {
                        File file = resource.getFile();
                        this.pkg = OPCPackage.open(file, PackageAccess.READ);
                    } else {
                        this.inputStream = resource.getInputStream();
                        this.pkg = OPCPackage.open(this.inputStream);
                    }
                } catch (Exception ex) {
                    throw new IllegalArgumentException("Unable to read data from resource", ex);
                }

                XSSFReader reader = new XSSFReader(this.pkg);
                this.initSheets(reader, this.pkg);
mdeinum commented 2 years ago

As this code was written years ago, the isFile method wasn't present back then (it was added in 5.0 if I read the docs). As we now use Spring Batch 4.3 which in turn uses Spring 5.3 we should be able to safely refactor this.

Thanks for the report.

mdeinum commented 2 years ago

I've implemented this with the isFile check. If you checkout the sources and build a version for yourself you could test this.

mbazos commented 2 years ago

@mdeinum tested the fix and it works great, do you know when this will be released so I can pull it from maven central?

mdeinum commented 2 years ago

I'll go through the issues see if there are some more low hanging fruit swe can fix and release a 0.1.1 version. Shouldn't be too long.