clalancette / pycdlib

Python library to read and write ISOs
GNU Lesser General Public License v2.1
147 stars 38 forks source link

Stack trace while formatting time fields in ISO headers #107

Closed mikalstill closed 1 year ago

mikalstill commented 1 year ago

I don't fully have a handle on this yet, but some values from time.time() cause a stack trace like this:

$ /srv/shakenfist/venv/bin/python
Python 3.9.2 (default, Feb 28 2021, 17:03:44) 
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pycdlib
>>> iso = pycdlib.PyCdlib()
>>> iso.new(interchange_level=4,
...                 joliet=True,
...                 rock_ridge='1.09',
...                 vol_ident='config-2')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/srv/shakenfist/venv/lib/python3.9/site-packages/pycdlib/pycdlib.py", line 3895, in new
    self.pvd = headervd.pvd_factory(sys_ident_bytes, vol_ident_bytes,
  File "/srv/shakenfist/venv/lib/python3.9/site-packages/pycdlib/headervd.py", line 809, in pvd_factory
    pvd.new(0, sys_ident, vol_ident, set_size, seqnum, log_block_size,
  File "/srv/shakenfist/venv/lib/python3.9/site-packages/pycdlib/headervd.py", line 318, in new
    self.volume_creation_date.new(now)
  File "/srv/shakenfist/venv/lib/python3.9/site-packages/pycdlib/dates.py", line 256, in new
    self.date_str = time.strftime(self.TIME_FMT, local).encode('utf-8') + '{:0<2}'.format(self.hundredthsofsecond).encode('utf-8') + struct.pack('=b', self.gmtoffset)
struct.error: byte format requires -128 <= number <= 127

In fact when this is happening, calling iso.new() is sufficient to get the stack trace. While debugging this problem the value of time.time() changed for obvious reasons, and now I am no longer getting the stack trace.

mikalstill commented 1 year ago

I guess I should say because GMT offset is mentioned on the offending line, that I am in Canberra, Australia where it is already 2023.

mikalstill commented 1 year ago

I've isolated the problem further:

Python 3.9.2 (default, Feb 28 2021, 17:03:44) 
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import time
>>> import struct
>>> from pycdlib import utils
>>> 
>>> epoch = time.time() - 3600
>>> print(epoch)
1672529432.1585703
>>> 
>>> offset = utils.gmtoffset_from_tm(epoch, time.localtime(epoch))
>>> print(offset)
-34996
>>> 
>>> struct.pack('=b', offset)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
struct.error: byte format requires -128 <= number <= 127
mikalstill commented 1 year ago

So I think ultimately the bug is in https://github.com/clalancette/pycdlib/blob/master/pycdlib/utils.py#L204, which should definitely not be returning a large negative number. The ISO specification seems to think this should be an unsigned 8 bit field https://wiki.osdev.org/ISO_9660#Date.2Ftime_format.

lod commented 1 year ago

I created a minimal reproduction and some exploration in https://gist.github.com/lod/943c47a48c2706905afc985887be869f

tmpyear = gmtime.tm_year - local.tm_year

When the local time is in the year ahead of utc this returns -1 Which then results in an offset = -34996 Which then makes struct sad

I explored a bit, but I think the core issue is that doing maths on dates sucks, there's always corner cases. You are handling two, this is a third, there are probably more.

https://github.com/clalancette/pycdlib/pull/108 rewrites the function to use seconds based maths, this fixes this bug/corner case and all the others.

This is the same code as the third function in the gist.

utkonos commented 1 year ago

Happy New Year!

I'm also hitting this issue. Here is some PDB showing the contents of self.gmtoffset:

>>> import pycdlib
>>> iso = pycdlib.PyCdlib()
>>> iso.new(rock_ridge='1.09', vol_ident='CIDATA')
> /test/venv/lib/python3.11/site-packages/pycdlib/dates.py(257)new()
-> self.date_str = time.strftime(self.TIME_FMT, local).encode('utf-8') + '{:0<2}'.format(self.hundredthsofsecond).encode('utf-8') + struct.pack('=b', self.gmtoffset)
(Pdb) self.gmtoffset
172
(Pdb) type(self)
<class 'pycdlib.dates.VolumeDescriptorDate'>
clalancette commented 1 year ago

Thank you so much for reporting this, and also doing the great investigation! That really helped narrow the problem down and make it easier to debug and write a test for. I ended up doing a different fix than in #108; in particular, I kept using the tm structures, and just rewrote it to fix it and make it work. You can see my fix in 20e3b8e8327a8b9c396cac5b5a19ced67c3f8132 .

I'd appreciate any testing you can do on that commit. I'm going to close this out as completed for now, but if that commit doesn't fix it for you, please feel free to reopen.

mikalstill commented 1 year ago

Thanks for the fix. I'll try and get some testing done later today. Do you have any feel for when you're likely to release next? I guess the urgency is off until next year, but it would be nice to force a dependancy on the new version while its fresh in everyone's mind.

clalancette commented 1 year ago

Once I get confirmation that it fixes the problem for you, I'll consider doing a release. Thanks again!

utkonos commented 1 year ago

@clalancette Works for me. But I didn't open the issue.

mikalstill commented 1 year ago

Yep, I agree this works. Thanks for the quick fix!