jmcnamara / XlsxWriter

A Python module for creating Excel XLSX files.
https://xlsxwriter.readthedocs.io
BSD 2-Clause "Simplified" License
3.65k stars 633 forks source link

Bug: Adding a lot of pictures to the workbook causes a picture error #913

Closed ye-xue closed 2 years ago

ye-xue commented 2 years ago

Current behavior

Add multiple images, the images are different, but the result is that the two pictures are the same. Snipaste_2022-10-11_13-05-01

Expected behavior

Different image data shows different image content

Sample code to reproduce

import io

import qrcode
from xlsxwriter import Workbook

def get_io_qr(qr_str, version=5, box_size=2, border=1) -> io.BytesIO:
    qr = qrcode.QRCode(version=version, box_size=box_size, border=border)
    qr.add_data(qr_str)
    qr_data = io.BytesIO()
    qr.make(fit=True)
    img = qr.make_image()
    img.save(qr_data)
    return qr_data

with Workbook('test.xlsx') as workbook:
    worksheet = workbook.add_worksheet('test')

    label_str = '1137249488/6000|KMC51030079V001|KM51006110V001|1'
    qr_data = get_io_qr(label_str, version=4, box_size=2)
    # qr_data.getvalue() = b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x00F\x00\x00\x00F\x01\x00\x00\x00\x00YU\xdf)\x00\x00\x013IDATx\x9cmR1J\x04A\x10\xac\xbd\xd9\xe4`d6\x15L\x0e6=\x18\x18\xc1d\x8f\xf1\x01\xf7\x18S\xc1\xdc\x07\xf8\x12\x7f02\x86\xbd\xf4\x82\xe9\x81\xa1\xe9,+\x98\xcc0\x06\x82\xd7+vXtuU\x17\xd5T\xfcL\xd9\xe0w\x02\xb2\xe1H\xd0glc\xf2\xb4\xeb5j\x10X\xe9\xf0t\xe8\xf1\x02\x81\x01j\xe0b\xb1\xc6z\x1c\xd0I\xac\xe1\xbd\xc2\xa8\x8a\x1b\xcfX;\xb7\x04\x144B\x17\xb5\xb2\xc9\x95k\x16\xf7\xa6R\xfav\xda\xcd\xb7\xd2\xb3\xa9\xc6\x18E$\xee\xb92\xdb\xf2\x88\xfe(\xb8\xa7\xf8^\xde\xb6\xf7\xa7Ip\x93\x0e\t\xc6\xd6A\xde\x8bI3a\x91^\x16\xadRr\xd1f\x89U\xa7\xfc0\x90\xd8kq\x111\x1f\x97\xd7$\xf6"{\x1b-\xb2\xe4&\xcb\xce;kH\xfef@\xcfZ\xed\xad\xd4\xc8n`\xf6Q\xcb\xfcP\xec\xf4e\xc7\x07\xb1Gy\t:\xd0\xca\x1f\x19\xa8\\y\xa5\x1b(\x9aA\xc1\xfc\xc99z\xe6\xa0\xd7=\xc8:\xea(s\xe1\\\x17_Y\xfe\xbb)\x1d.\xcb\xf5|\x93d\x06(\x1f\xa7O\xbd\xed\xd6\xdd\xb8\x9a\xb7w\xb2\x07\xe0\x9c\xac\xd6\x9eV\xfe@.\xd8\xe8\x84n\xf3OO\xbf\x01\x0c\x14\x92\x9dA\x96\x0f]\x00\x00\x00\x00IEND\xaeB`\x82'
    worksheet.insert_image(0, 0, label_str,
                           {'image_data': qr_data, 'x_offset': 5, 'y_offset': 5})
    label_str = '1137249493/6000|KMC51030079V001|KM51123515V000|1.0'
    qr_data = get_io_qr(label_str, version=4, box_size=2)
    # qr_data.getvalue() = b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x00F\x00\x00\x00F\x01\x00\x00\x00\x00YU\xdf)\x00\x00\x013IDATx\x9cm\x92\xc1j\xc30\x0c\x86\xe5$0\x06\x02\xe7:\xd8\xa5\xd0\xeb\xc0#\x85^\x1c\\\xd8u\xaf\xb6\x87)\xec\x01R<\x18\x03\x15\xfb:\xd8C8\xd4\xd0\x8b\x8dv\x18,\xca\x98\x8e\x1f?\xfa\x84$\xc5\xf0S\xb5\x81\xdf\x9a\xa0\xb8\xe0\tpa\x9d.\xef\xc7\xf6\xab\xf2yaM\xed\xf1i\xdc\xc2\t\x04\x03\xc8\x14\xaa\x8153q\x84^2\x08\x85\x1d\x05b\x12\x8e\xb9\xa3\x17\xa8\xa0\x84\x17\x98I\x17\x0e\\D\xbfX!wq3\x1f\x16\xa6\xc2a\xa3#\xe4\x0f\xe9H%\xb1\xf5!Y\xc9\xf4\x84:\x18Z9\xcc\xc5\x806,r\xca\xdb8\xeb\xba\xbb\xdc\xc9\x9c%J\x83\x071\x0b\x14\xed\xc8Y\xbb\xea\xe7\xa9\xe0\x84\xecE\xae3\xb0\xf3y\xab\x8a\x9c\xcf\x97\x00n\xb0\x9a\xe4No\x1e\xf1\x88\xd7\x07#w\x05L\x81\x9d\x17\x8e&s\xdd\xd7\xb7\xd7*sDH\x98J\x12^EW\x1cS\x9bo\xa3\xbc/\x9ab[\xd0\xf2\xbe\xeer\xba\xb6\x04\xf3\xb3p\x9c\x94\x81s\x1e[\xe1ht\x81:X/\xfb5\xb5\x87\xfb\xda\xcf\xfb$g\xc1\x84ap\x88\xab?\xd0i\xfe\x8c\x7f\xfe\xa0xDGr\xa7\x13L<\x19?\xc8{\xfc\xf3\xa7\xdf\xc6\x99\x9a\x1c\x19\x0f\x96\x05\x00\x00\x00\x00IEND\xaeB`\x82'
    worksheet.insert_image(10, 0, label_str,
                           {'image_data': qr_data, 'x_offset': 5, 'y_offset': 5})

Environment

- XlsxWriter version:3.0.2
- Python version:3.8.10
- Excel version:office2019
- OS:windows 10 21H2

Any other information

No response

OpenOffice and LibreOffice users

jmcnamara commented 2 years ago

Thanks for the detailed report. That looks like a bug. I'll look into it.

jmcnamara commented 2 years ago

I don't see the same result as you. I modified your sample code to move the second image closer to the first for comparison and the images are different:

Screenshot 2022-10-11 at 11 35 14

Here is an image comparison of the images in the output file (renamed to qrcode.xlsx). The grey areas are different between the 2 images:

Screenshot 2022-10-11 at 11 36 39

The QR codes read from the images is also different.

ye-xue commented 2 years ago

I tested it on another computer and the result is still the same. What does the code you use look like? I'll test it

jmcnamara commented 2 years ago

Here is the code I ran. It is the same as yours, without the commented out code, with the second image moved to cell C1 and with the filename changed:

import io

import qrcode
from xlsxwriter import Workbook

def get_io_qr(qr_str, version=5, box_size=2, border=1) -> io.BytesIO:
    qr = qrcode.QRCode(version=version, box_size=box_size, border=border)
    qr.add_data(qr_str)
    qr_data = io.BytesIO()
    qr.make(fit=True)
    img = qr.make_image()
    img.save(qr_data)
    return qr_data

with Workbook('gh913.xlsx') as workbook:
    worksheet = workbook.add_worksheet('test')

    label_str = '1137249488/6000|KMC51030079V001|KM51006110V001|1'
    qr_data = get_io_qr(label_str, version=4, box_size=2)

    worksheet.insert_image(0, 0, label_str,
                           {'image_data': qr_data, 'x_offset': 5, 'y_offset': 5})

    label_str = '1137249493/6000|KMC51030079V001|KM51123515V000|1.0'
    qr_data = get_io_qr(label_str, version=4, box_size=2)

    worksheet.insert_image(0, 2, label_str,
                           {'image_data': qr_data, 'x_offset': 5, 'y_offset': 5})

I've attached the output file: gh913.xlsx

ye-xue commented 2 years ago

Using the same code I ran out of still the same problem. When checking whether it is the same picture, the Windows side determines that the two pictures are the same. image gh913.xlsx

jmcnamara commented 2 years ago

That is a bit weird. I see the same as you with Excel 15 but not other versions:

I'll look into it a bit more.

jmcnamara commented 2 years ago

Test images, for reference: image1 image2

jmcnamara commented 2 years ago

Reference files for testing:

The gh913_excel??.xlsx Excel files were created by opening a new blank worksheet in Excel, and inserted image1.png (from above) in Cel A1 and image2.png in cell C1, and then saving the file. The gh913_xlsxwriter.xlsx file was created using XlsxWriter.

Note, all of these files show different images in Excel12 and Excel16 but the same in Excel15. Even when inserting the images in Excel15 they looked the same. You can test this yourself.

So this looks like an Excel15 issue since it is clearly reproducible with files created in other versions of Excel. Again, you can verify this for yourself.

jmcnamara commented 2 years ago

Also, interestingly, the file created with Excel15 only has one unique image in it, although I double checked that I had added both:

 unzip -l ~/Desktop/gh913_excel15.xlsx
Archive:  /Users/John/Desktop/gh913_excel15.xlsx
  Length      Date    Time    Name
---------  ---------- -----   ----
     1202  01-01-1980 00:00   [Content_Types].xml
      588  01-01-1980 00:00   _rels/.rels
     2180  01-01-1980 00:00   xl/workbook.xml
      557  01-01-1980 00:00   xl/_rels/workbook.xml.rels
      954  01-01-1980 00:00   xl/worksheets/sheet1.xml
     8390  01-01-1980 00:00   xl/theme/theme1.xml
     1618  01-01-1980 00:00   xl/styles.xml
     2528  01-01-1980 00:00   xl/drawings/drawing1.xml
      364  01-01-1980 00:00   xl/media/image1.png
      299  01-01-1980 00:00   xl/worksheets/_rels/sheet1.xml.rels
      292  01-01-1980 00:00   xl/drawings/_rels/drawing1.xml.rels
      619  01-01-1980 00:00   docProps/core.xml
      785  01-01-1980 00:00   docProps/app.xml
---------                     -------
    20376                     13 files

Note, this contains image1.png only. Excel usually only stores 1 copy of each unique image. So it seems that Excel15 thinks that image1.png and image2.png are the same.

Here is the same output for the Excel16 file:

~/Development/xlsx/inspect$ unzip -l ~/Desktop/gh913_excel16.xlsx
Archive:  /Users/John/Desktop/gh913_excel16.xlsx
  Length      Date    Time    Name
---------  ---------- -----   ----
     1202  01-01-1980 00:00   [Content_Types].xml
      588  01-01-1980 00:00   _rels/.rels
      557  01-01-1980 00:00   xl/_rels/workbook.xml.rels
     2249  01-01-1980 00:00   xl/workbook.xml
     2528  01-01-1980 00:00   xl/drawings/drawing1.xml
      364  01-01-1980 00:00   xl/media/image1.png
     1618  01-01-1980 00:00   xl/styles.xml
      427  01-01-1980 00:00   xl/drawings/_rels/drawing1.xml.rels
      364  01-01-1980 00:00   xl/media/image2.png
     8390  01-01-1980 00:00   xl/theme/theme1.xml
      971  01-01-1980 00:00   xl/worksheets/sheet1.xml
      299  01-01-1980 00:00   xl/worksheets/_rels/sheet1.xml.rels
      619  01-01-1980 00:00   docProps/core.xml
      795  01-01-1980 00:00   docProps/app.xml
---------                     -------
    20971                     14 files

Note, this contains 2 image files: image1.png and image2.png.

So this, to me, looks like an Excel15 bug.

ye-xue commented 2 years ago

I use Microsoft Office Home and Student 2019 to open these files with the same pattern. There are no other versions of Office to test at this time. image

ye-xue commented 2 years ago

The major version we use is Office2019, is there a way to avoid this problem?

jmcnamara commented 2 years ago

I use Microsoft Office Home and Student 2019 to open these files with the same pattern.

If you see the same image when you open the Excel12 and Excel16 file then that implies that this isn't an XlsxWriter issue since those files were created in Excel and not XlsxWriter.

The major version we use is Office2019, is there a way to avoid this problem?

I don't think so. It looks like an Excel bug.

ye-xue commented 2 years ago

Thank you so much! Can you file this bug with Microsoft?

jmcnamara commented 2 years ago

Thank you so much! Can you file this bug with Microsoft?

Sorry, no. I don't know how. If you submit it you can point back to the test files here.

Closing.

jmcnamara commented 2 years ago

P.S., it may be worth installing any available Excel updates to see if this issue was fixed in an update.