alibaba / easyexcel

快速、简洁、解决大文件内存溢出的java处理Excel工具
https://easyexcel.opensource.alibaba.com
Apache License 2.0
32.26k stars 7.51k forks source link

导入大量excel数据解析失败 #2605

Open Dinny-xu opened 2 years ago

Dinny-xu commented 2 years ago
    InputStream inputStream = Jsoup.connect(excelUrl).ignoreContentType(true).execute().bodyStream();
    //实例化实现了AnalysisEventListener接口的类
    ExcelListener<ImportTeacherInfoVO> listener = new ExcelListener<>();
    //读取excel
    EasyExcel.read(inputStream, ImportTeacherInfoVO.class,
            listener).sheet(0).headRowNumber(3).doRead();

    List<ImportTeacherInfoVO> excelList = listener.getDataList();

excel格式为.xlsx, 总共有23个字段,8000条数据, 抛如下异常

org.apache.poi.openxml4j.exceptions.InvalidOperationException: Could not open the specified zip entry source stream at org.apache.poi.openxml4j.opc.ZipPackage.openZipEntrySourceStream(ZipPackage.java:205) at org.apache.poi.openxml4j.opc.ZipPackage.openZipEntrySourceStream(ZipPackage.java:187) at org.apache.poi.openxml4j.opc.ZipPackage.openZipEntrySourceStream(ZipPackage.java:161) at org.apache.poi.openxml4j.opc.ZipPackage.(ZipPackage.java:142) at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:295) at com.alibaba.excel.analysis.v07.XlsxSaxAnalyser.readOpcPackage(XlsxSaxAnalyser.java:173) at com.alibaba.excel.analysis.v07.XlsxSaxAnalyser.(XlsxSaxAnalyser.java:68) at com.alibaba.excel.analysis.ExcelAnalyserImpl.choiceExcelExecutor(ExcelAnalyserImpl.java:103) at com.alibaba.excel.analysis.ExcelAnalyserImpl.(ExcelAnalyserImpl.java:55) at com.alibaba.excel.ExcelReader.(ExcelReader.java:27) at com.alibaba.excel.read.builder.ExcelReaderBuilder.build(ExcelReaderBuilder.java:202) at com.alibaba.excel.read.builder.ExcelReaderBuilder.sheet(ExcelReaderBuilder.java:239) at com.alibaba.excel.read.builder.ExcelReaderBuilder.sheet(ExcelReaderBuilder.java:231) at cn.sxw.pros.schoolcenter.web.service.ImportExportService.parseBiz(ImportExportService.java:211) at cn.sxw.pros.schoolcenter.web.service.ImportExportService.access$000(ImportExportService.java:95) at cn.sxw.pros.schoolcenter.web.service.ImportExportService$1.call(ImportExportService.java:166) at cn.sxw.commons.basic.hystrix.AbstractHystrixCommand.run(AbstractHystrixCommand.java:61) at com.netflix.hystrix.HystrixCommand$2.call(HystrixCommand.java:302) at com.netflix.hystrix.HystrixCommand$2.call(HystrixCommand.java:298) at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:46) at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:35) at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48) at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30) at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48) at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30) at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48) at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30) at rx.Observable.unsafeSubscribe(Observable.java:10151) at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:51) at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:35) at rx.Observable.unsafeSubscribe(Observable.java:10151) at rx.internal.operators.OnSubscribeDoOnEach.call(OnSubscribeDoOnEach.java:41) at rx.internal.operators.OnSubscribeDoOnEach.call(OnSubscribeDoOnEach.java:30) at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48) at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30) at rx.Observable.unsafeSubscribe(Observable.java:10151) at rx.internal.operators.OperatorSubscribeOn$1.call(OperatorSubscribeOn.java:94) at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction$1.call(HystrixContexSchedulerAction.java:56) at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction$1.call(HystrixContexSchedulerAction.java:47) at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction.call(HystrixContexSchedulerAction.java:69) at rx.internal.schedulers.ScheduledAction.run(ScheduledAction.java:55) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266) at java.util.concurrent.FutureTask.run(FutureTask.java) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.io.EOFException: Unexpected end of ZLIB input stream at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:240) at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158) at java.util.zip.ZipInputStream.read(ZipInputStream.java:194) at org.apache.poi.openxml4j.util.ZipSecureFile$ThresholdInputStream.read(ZipSecureFile.java:220) at java.io.FilterInputStream.read(FilterInputStream.java:107) at org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource$FakeZipEntry.(ZipInputStreamZipEntrySource.java:132) at org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource.(ZipInputStreamZipEntrySource.java:56) at org.apache.poi.openxml4j.opc.ZipPackage.openZipEntrySourceStream(ZipPackage.java:203) ... 46 common frames omitted

zhuangjiaju commented 2 years ago

这个需要提供下excel

Dinny-xu commented 2 years ago

这个需要提供下excel

URL提供:https://cdn.xycloud.site/%E6%B5%8B%E8%AF%95%E6%95%B0%E6%8D%AE1%E4%B8%87%E4%B8%AA.xlsx

可根据该url 进行测试,代码以上示例

zhuangjiaju commented 1 year ago

是不是你上传的代码有问题?我实际测试了没问题

    @Test
    public void test() throws Exception {
        // File file = TestFileUtil.readUserHomeFile("test/test4.xlsx");
        //        File file = TestFileUtil.readUserHomeFile("test/test6.xls");
        File file = new File("/Users/zhuangjiaju/IdeaProjects/easyexcel/src/test/resources/converter/converter07.xlsx");

        List<Object> list = EasyExcel.read(
                "/Users/zhuangjiaju/Downloads/测试数据1万个.xlsx")
            //.useDefaultListener(false)
            .sheet(0)
            .headRowNumber(0).doReadSync();
        LOGGER.info("数据:{}", list.size());
        for (Object data : list) {
            LOGGER.info("返回数据:{}", CollectionUtils.size(data));
            LOGGER.info("返回数据:{}", JSON.toJSONString(data));
        }
    }
Dinny-xu commented 1 year ago

我们俩的代码都不一样,你看我上传的代码是Jsoup解析的,你这是new File 。

persysc commented 1 year ago

我也遇到这个问题了 数据量是50w

Dinny-xu commented 1 year ago

如果你的代码跟我上面一样是使用的流式解析的话,一定会出错的,可能跟easyexcel 本身没有问题,因为还没走到那一步,可以把url 转文件进行读取就没问题了。