python / cpython

The Python programming language
https://www.python.org

Strange 3.14.0a1 problem in attribute assignment #125968

Open replabrobin opened 3 weeks ago

replabrobin commented 3 weeks ago

Bug report

Bug description:

        # bold, italic
        _ = tt2ps(frag.fontName,frag.bold,frag.italic)
        if _=='helvetica': breakpoint()
        frag.fontName = _
        if frag.fontName=='helvetica': breakpoint()

The code above stops at the second breakpoint. The value of _ is 'Helvetica', but the attribute value is still 'helvetica' (the original value). It's not clear whether the assignment is never carried out or the value is somehow transformed to lower case. The frag class is ParaFrag(ABag): pass, with ABag defined as

class ABag:
    """
    'Attribute Bag' - a trivial BAG class for holding attributes.

    This predates modern Python.  Doing this again, we'd use a subclass
    of dict.

    You may initialize with keyword arguments.
    a = ABag(k0=v0,....,kx=vx,....) ==> getattr(a,'kx')==vx

    c = a.clone(ak0=av0,.....) copy with optional additional attributes.
    """
    def __init__(self,**attr):
        self.__dict__.update(attr)

    def clone(self,**attr):
        n = self.__class__(**self.__dict__)
        if attr: n.__dict__.update(attr)
        return n

    def __repr__(self):
        D = self.__dict__
        K = list(D.keys())
        K.sort()
        return '%s(%s)' % (self.__class__.__name__,', '.join(['%s=%r' % (k,D[k]) for k in K]))

The _ intermediate variable is not actually used in the original code, but both the original and this code work as expected in Python 3.9-3.13.

Original code is https://hg.reportlab.com/hg-public/reportlab/file/tip/src/reportlab/platypus/paraparser.py (line 3131) and https://hg.reportlab.com/hg-public/reportlab/file/tip/src/reportlab/lib/abag.py

CPython versions tested on:

3.14

Operating systems tested on:

Linux

skirpichev commented 3 weeks ago

Could you please try to provide a minimal example? The "original code" seems unavailable, what's tt2ps()?

replabrobin commented 3 weeks ago

tt2ps is just a function that returns a string. It looks like

#......
#maps a piddle font to a postscript one.
_tt2ps_map = {
            #face, bold, italic -> ps name
            ('times', 0, 0) :'Times-Roman',
            ('times', 1, 0) :'Times-Bold',
            ('times', 0, 1) :'Times-Italic',
            ('times', 1, 1) :'Times-BoldItalic',

            ('courier', 0, 0) :'Courier',
            ('courier', 1, 0) :'Courier-Bold',
            ('courier', 0, 1) :'Courier-Oblique',
            ('courier', 1, 1) :'Courier-BoldOblique',

            ('helvetica', 0, 0) :'Helvetica',
            ('helvetica', 1, 0) :'Helvetica-Bold',
            ('helvetica', 0, 1) :'Helvetica-Oblique',
            ('helvetica', 1, 1) :'Helvetica-BoldOblique',

            # there is only one Symbol font
            ('symbol', 0, 0) :'Symbol',
            ('symbol', 1, 0) :'Symbol',
            ('symbol', 0, 1) :'Symbol',
            ('symbol', 1, 1) :'Symbol',

            # ditto for dingbats
            ('zapfdingbats', 0, 0) :'ZapfDingbats',
            ('zapfdingbats', 1, 0) :'ZapfDingbats',
            ('zapfdingbats', 0, 1) :'ZapfDingbats',

            ('zapfdingbats', 1, 1) :'ZapfDingbats',
            }
#..........
def tt2ps(fn,b,i):
    'family name + bold & italic to ps font name'
    K = (fn.lower(),b,i)
    if K in _tt2ps_map:
        return _tt2ps_map[K]
    else:
        fn, b1, i1 = ps2tt(K[0])
        K = fn, b1|b, i1|i
        if K in _tt2ps_map:
            return _tt2ps_map[K]
    raise ValueError("Can't find concrete font for family=%s, bold=%d, italic=%d" % (fn, b, i))

It is defined in https://hg.reportlab.com/hg-public/reportlab/file/tip/src/reportlab/lib/fonts.py. So far as I can tell, the original frag.fontName == 'helvetica', so K == ('helvetica', 0, 0) should be in the keys and tt2ps should return 'Helvetica', which it appears to do correctly. The problem arises in the assignment somehow.

As for a minimal example, I have tried to reduce the problem, but it seems to be a Heisenbug and only appears when I run my full test set.

skirpichev commented 3 weeks ago

it is defined in https://hg.reportlab.com/hg/reportlab/file/tip/src/reportlab/lib/fonts.py

Again, this source seems unavailable. At least for me, it shows HTTP 401 error.

Please provide self-contained code to reproduce your problem.

replabrobin commented 3 weeks ago

I improved the debugging somewhat using this code


        # bold, italic
        print(f'1:{frag.fontName=} {id(frag.fontName)=}')
        _ = tt2ps(frag.fontName,frag.bold,frag.italic)
        print(f'{_=} {id(_)=}')
        if _=='helvetica': breakpoint()
        frag.fontName = _
        if frag.fontName=='helvetica':
            print(f'2:{frag.fontName=} {id(frag.fontName)=}')
            breakpoint()

when I run the failing tests I see this output

.........
1:frag.fontName='helvetica' id(frag.fontName)=127745020596784
_='Helvetica' id(_)=127745023195376
1:frag.fontName='helvetica' id(frag.fontName)=127745020596784
_='Helvetica' id(_)=127745023195376
1:frag.fontName='helvetica' id(frag.fontName)=127745020596784
_='Helvetica' id(_)=127745023195376
1:frag.fontName='helvetica' id(frag.fontName)=127745020596784
_='Helvetica' id(_)=127745023195376
2:frag.fontName='helvetica' id(frag.fontName)=127745020596784
> /home/robin/devel/reportlab/reportlab/platypus/paraparser.py(3138)handle_data()
-> breakpoint()
(Pdb) 

showing repeated correct results followed by a failure. The frag attribute is unchanged, i.e. it still has the original id of the lower case string 'helvetica'. Could this be some hotness optimization happening?
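One way to narrow down where the stale value comes from is to compare the different views of the attribute at the point of failure. The helper below is only a sketch: describe_attr is a hypothetical name, not part of reportlab or CPython, and it assumes the object stores its attributes in an ordinary instance __dict__, as ABag does.

```python
def describe_attr(obj, name):
    """Collect the different views of obj.<name> for side-by-side comparison.

    Hypothetical debugging helper; assumes obj has an instance __dict__.
    """
    instance_dict = vars(obj)
    return {
        # Value seen through normal attribute lookup.
        "getattr": getattr(obj, name, None),
        # Value stored directly in the instance dictionary, if any.
        "in_instance_dict": name in instance_dict,
        "dict_value": instance_dict.get(name),
        # A class-level attribute or descriptor of the same name could shadow the instance value.
        "defined_on_class": any(name in vars(klass) for klass in type(obj).__mro__),
        # A custom __setattr__ could intercept or rewrite the assignment.
        "custom_setattr": type(obj).__setattr__ is not object.__setattr__,
    }
```

Printing describe_attr(frag, 'fontName') just before the second breakpoint would show whether the attribute lookup and the instance dictionary disagree, or whether something on the class is intercepting the assignment.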

replabrobin commented 3 weeks ago

Really sorry, I got the source URLs wrong. They should be
https://hg.reportlab.com/hg-public/reportlab/file/tip/src/reportlab/platypus/paraparser.py
https://hg.reportlab.com/hg-public/reportlab/file/tip/src/reportlab/lib/abag.py
https://hg.reportlab.com/hg-public/reportlab/file/tip/src/reportlab/lib/fonts.py

ericvsmith commented 3 weeks ago

By "self contained code", @skirpichev means something we can run without downloading any other code. In other words, simplify your example to something we can copy and paste into Python running on our systems. Maybe you can inline the functions you're calling, or remove the function call and just do the assignment directly.
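As an illustration of what such a self-contained script might look like, here is a minimal sketch built only from the pieces quoted in this thread. It makes several assumptions: the frag is created with fontName='helvetica', bold=0 and italic=0; the font map is trimmed to the single entry involved; the ps2tt fallback is dropped; and count_stale_assignments is a hypothetical name. It is not known to reproduce the problem, which so far only shows up in the full test run.

```python
class ABag:
    """Trimmed copy of the attribute bag from the report."""

    def __init__(self, **attr):
        self.__dict__.update(attr)

    def clone(self, **attr):
        n = self.__class__(**self.__dict__)
        if attr:
            n.__dict__.update(attr)
        return n


class ParaFrag(ABag):
    pass


# Trimmed stand-in for _tt2ps_map: only the entry exercised in the report.
_tt2ps_map = {("helvetica", 0, 0): "Helvetica"}


def tt2ps(fn, b, i):
    """Simplified tt2ps without the ps2tt fallback (assumption)."""
    return _tt2ps_map[(fn.lower(), b, i)]


def count_stale_assignments(iterations=10_000):
    """Return how often the attribute did not hold the new value after assignment."""
    stale = 0
    for _ in range(iterations):
        frag = ParaFrag(fontName="helvetica", bold=0, italic=0)
        name = tt2ps(frag.fontName, frag.bold, frag.italic)
        frag.fontName = name
        # On a correct interpreter the attribute is the very object just assigned.
        if frag.fontName is not name:
            stale += 1
    return stale


if __name__ == "__main__":
    print("stale assignments:", count_stale_assignments())
```

A non-zero count would indicate the same symptom as in the report. The loop runs many iterations in case the behaviour depends on how often the code path has been executed.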

replabrobin commented 3 weeks ago

I've tried several times to make a smaller example. The behaviour is so variable that I tried replicating it on different hardware (Intel --> AMD). The problem persists. I put a global counter into the debug output, and the number of passes through this code before the failure does vary. In addition, it sometimes manages to pass all the tests without error. It's a real Heisenbug. This python is in a vm based off a self build from source. Perhaps I have not got the correct config. I'll continue investigating and trying to make a self-contained version.

skirpichev commented 3 weeks ago

Sorry, we still have no idea which code you actually run. What do the tests do? Can you minimize the number of tests? Can you reproduce this without pytest (or whatever else you are using)?

replabrobin commented 3 weeks ago

FWIW I am running this statement in the top level of the reportlab repository

~/devel/reportlab/REPOS/reportlab
$ (cd tests && /home/robin/devel/reportlab/.py314/bin/python runAll.py --failfast)

when I run this with my patched code I see

1:frag.fontName='helvetica' id(frag.fontName)=136384627218352 210
_='Helvetica' id(_)=136384629815856
1:frag.fontName='helvetica' id(frag.fontName)=136384627218352 211
_='Helvetica' id(_)=136384629815856
2:frag.fontName='helvetica' id(frag.fontName)=136384627218352
> /home/robin/devel/reportlab/reportlab/platypus/paraparser.py(3142)handle_data()
-> breakpoint()
(Pdb)

Sometimes the count is 159 instead, or it passes through to cause an error.