monitorjbl / excel-streaming-reader

An easy-to-use implementation of a streaming Excel reader using Apache POI
Apache License 2.0
959 stars 345 forks

read picture #258

Open archine opened 2 years ago

archine commented 2 years ago

How can I get the pictures in cells when importing a workbook?

pjfanning commented 2 years ago

With Apache POI, XSSFWorkbook and XSSFSheet, you can iterate over the pictures using:

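      // sheet0 is an XSSFSheet; getDrawingPatriarch() returns null when the sheet has no drawing
      Drawing<?> drawingPatriarch = sheet0.getDrawingPatriarch();
      List<XSSFPicture> pictures = new ArrayList<>();
      if (drawingPatriarch != null) {
        for (Shape shape : drawingPatriarch) {
          // keep only the pictures; other shapes (e.g. text boxes) are skipped
          if (shape instanceof XSSFPicture) {
            pictures.add((XSSFPicture)shape);
          }
        }
      }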

XSSFPicture has XSSFClientAnchor getClientAnchor(), and XSSFClientAnchor has getCol1(), getCol2(), getRow1() and getRow2(). These values tell you which cells are covered by the picture.
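
As a rough sketch of how the four anchor values map to cells, here is a small helper in plain Java with no POI dependency. The class `AnchorCells` and its methods are hypothetical names for illustration, not part of POI or of either reader. The four ints stand in for whatever getCol1(), getRow1(), getCol2() and getRow2() return. It assumes the indexes are 0-based and treats col2/row2 as inclusive, so it may list one extra row or column when a picture ends exactly on a cell boundary.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of POI: turns anchor corners into A1-style addresses.
class AnchorCells {

    // Lists every cell in the rectangle (row1, col1)..(row2, col2), inclusive.
    // All four values are assumed to be 0-based, as returned by the anchor getters.
    public static List<String> coveredCells(int col1, int row1, int col2, int row2) {
        List<String> cells = new ArrayList<>();
        for (int r = row1; r <= row2; r++) {
            for (int c = col1; c <= col2; c++) {
                cells.add(columnName(c) + (r + 1));
            }
        }
        return cells;
    }

    // Converts a 0-based column index to Excel letters (0 -> A, 25 -> Z, 26 -> AA).
    public static String columnName(int col) {
        StringBuilder sb = new StringBuilder();
        for (int n = col + 1; n > 0; n = (n - 1) / 26) {
            sb.insert(0, (char) ('A' + (n - 1) % 26));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Anchor spanning columns 1..2 and rows 2..3 (0-based)
        System.out.println(coveredCells(1, 2, 2, 3)); // prints [B3, C3, B4, C4]
    }
}
```

With a real picture you would pass the values from its client anchor, for example `coveredCells(anchor.getCol1(), anchor.getRow1(), anchor.getCol2(), anchor.getRow2())`.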

excel-streaming-reader does not implement support for shapes and pictures.

My fork does.

This is a unit test in my fork that demos some of the features.

  @Test
  public void testGetPicturesWithReadShapesEnabled() throws Exception {
    try (
            InputStream stream = getInputStream("WithDrawing.xlsx");
            Workbook workbook = StreamingReader.builder().setReadShapes(true).open(stream)
    ) {
      List<? extends PictureData> pictureList = workbook.getAllPictures();
      assertEquals(5, pictureList.size());
      for(PictureData picture : pictureList) {
        XlsxPictureData xlsxPictureData = (XlsxPictureData)picture;
        assertTrue("picture data is not empty", picture.getData().length > 0);
        assertArrayEquals(picture.getData(), IOUtils.toByteArray(xlsxPictureData.getInputStream()));
      }
      Sheet sheet0 = workbook.getSheetAt(0);
      sheet0.rowIterator().hasNext();
      Drawing<?> drawingPatriarch = sheet0.getDrawingPatriarch();
      assertNotNull("drawingPatriarch should not be null", drawingPatriarch);
      List<XSSFPicture> pictures = new ArrayList<>();
      for (Shape shape : drawingPatriarch) {
        if (shape instanceof XSSFPicture) {
          pictures.add((XSSFPicture)shape);
        } else {
          //there is one text box and 5 pictures on the sheet
          XSSFSimpleShape textBox = (XSSFSimpleShape)shape;
          String text = textBox.getText().replace("\r", "").replace("\n", "");
          assertEquals("Sheet with various pictures(jpeg, png, wmf, emf and pict)", text);
        }
        assertTrue("shape is an XSSFShape", shape instanceof XSSFShape);
        assertNotNull("shape has anchor", shape.getAnchor());
      }
      assertEquals(5, pictures.size());
      Sheet sheet1 = workbook.getSheetAt(1);
      sheet1.rowIterator().hasNext();
      assertNull("sheet1 should have no drawing patriarch", sheet1.getDrawingPatriarch());
    }
  }

archine commented 2 years ago

xlsx-streamer doesn't seem to support reading pictures. When I try to get them through StreamingSheet, the following occurs: [image]

pjfanning commented 2 years ago

My fork does - https://github.com/pjfanning/excel-streaming-reader/blob/main/src/main/java/com/github/pjfanning/xlsx/impl/StreamingSheet.java#L198

archine commented 2 years ago

This seems to be a new project

archine commented 2 years ago

Excuse me, since I cannot open an issue on the POI project, I would like to ask you here: when I read a picture from a pptx file through POI, the picture's fill color cannot be read. How should I deal with this?