dcantrell / pyparted

Python bindings for GNU parted (libparted)
GNU General Public License v2.0
85 stars 42 forks source link

Expose new partition type ID and UUID accessors #95

Closed berrange closed 1 year ago

berrange commented 1 year ago

The parted 3.5 release introduced new APIs for getting/setting the partition type ID and UUID metadata. This in turn needs to be exposed in pyparted.

This unlocks ability for tools like blivet to then support the systemd discoverable partition spec by setting suitable GPT UUIDs on partitions.

I wasn't clear on what policy pyparted takes wrt back-compatibility with older parted releases. Since commit 07ba882d04fa2099b53d41370416b97957d2abcb conditionally exposed the new enum constants, I tried to follow its lead and conditionally expose the new APIs too.

berrange commented 1 year ago

Now re-pushed with some extensive unit test coverage too

berrange commented 1 year ago

I was also seeing FPE crashes randomly when using a newer scratch build if libparted, but I'm having trouble tracking that down and can't reproduce it even once with the shipped version so that may not be a problem with this PR but I want to be sure.

I just tried a git master of libparted with this series and i didn't see anything odd from pytest

bcl commented 1 year ago

I appear to be stuck. I can somewhat consistently make this crash but I cannot figure out what's causing it. I setup a new rawhide vm, with the rawhide release of parted-3.5-6.fc38.x86_64, installed the build deps for pyparted, copied over the pull/95 pyparted code to a directory and then ran:

[root@fedora pyparted]# make clean && rm -rf tests/__pycache__ && make test

It passes all the test and then coredumps with a FPE:

Ran 281 tests in 0.642s                                                                                                                                                                                            

OK (skipped=112)                                                                                                                                                                                                   
make: *** [Makefile:42: test] Floating point exception (core dumped)                         

I have also seen it dump in the middle of the test, but usually it happens at the end.

Program terminated with signal SIGFPE, Arithmetic exception.                                                                                                                                                       
#0  0x00007f49621eed03 in ped_div_round_up (divisor=0, numerator=100)                                                                                                                                              
    at labels/../../include/parted/natmath.h:130                                                                                                                                                                   
130             return (numerator + divisor - 1) / divisor;                                                                                                                                                        
Missing separate debuginfos, use: dnf debuginfo-install python3-3.11.0~rc2-1.fc38.x86_64                                                                                                                           
(gdb) up                                                                                                                                                                                                           
#1  gpt_alloc_metadata (disk=0x5562658c9f90) at labels/gpt.c:1509                                                                                                                                                  
1509      gptlength = ped_div_round_up (sizeof (GuidPartitionTableHeader_t),                                                                                                                                       
(gdb) list                                                                                                                                                                                                         
1504      PED_ASSERT (disk != NULL);                                                                                                                                                                               
1505      PED_ASSERT (disk->dev != NULL);                                                                                                                                                                          
1506      PED_ASSERT (disk->disk_specific != NULL);                                                                                                                                                                
1507      gpt_disk_data = disk->disk_specific;                                                                                                                                                                     
1508                                                                                                                                                                                                               
1509      gptlength = ped_div_round_up (sizeof (GuidPartitionTableHeader_t),                                                                                                                                       
1510                                    disk->dev->sector_size);                                                                                                                                                   
1511      pteslength = ped_div_round_up (gpt_disk_data->entry_count                                                                                                                                                
1512                                     * sizeof (GuidPartitionEntry_t),                                                                                                                                          
1513                                     disk->dev->sector_size);                                                                                                                                                  
(gdb) p *disk                                                                                                                                                                                                      
$1 = {dev = 0x556265850800, type = 0x7f4962214900 <gpt_disk_type>, block_sizes = 0x71306f666a6f6d2d,                                                                                                               
  part_list = 0x0, disk_specific = 0x556265948020, needs_clobber = 1, update_mode = 1}                                                                                                                                                                                                                                                                                                    
(gdb) p *disk->dev                                                                                                                                                                                                 
$2 = {next = 0x1, model = 0x8ffffffff <error: Cannot access memory at address 0x8ffffffff>, path = 0x0,                                                                                                            
  type = PED_DEVICE_UNKNOWN, sector_size = 0, phys_sector_size = 33, length = 93901736590848,                                                                                                                      
  open_count = 0, read_only = 0, external_mode = 0, dirty = 0, boot_dirty = 49, hw_geom = {                                                                                                                        
    cylinders = 0, heads = 1886221359, sectors = 1835365423}, bios_geom = {cylinders = 1701064048,                                                                                                                 
    heads = 1701013878, sectors = 812790573}, host = 28260, did = 12398, arch_specific = 0x77}           

This happens while python is deallocating the objects, it looks like *dev got freed first and then reused/clobbered by something else since those numbers make no sense. This may be happening all the time, but if sector_size isn't zero it doesn't hit the FPE so nobody notices.

I was suspicious of the reopen functions, but removing them doesn't help -- it will still crash. Switching to the master version of pyparted doesn't crash (or at least hasn't yet in over 20 attempts).

I've stared at the parted changes since 3.5 and at this PR and I do not see anything wrong with them.

bcl commented 1 year ago

I am pretty sure I tracked it down. The fix is #97, basically don't call any of the device_free_all methods because you can't guarantee there isn't some python object still pointing to it.

Found it by running the tests with valgrind which helpfully pointed out that it was using an int from a block of memory that had been previously freed.

So, with that change, I think this PR is ok :)

berrange commented 1 year ago

Oh, that's really very subtle. The fun of native python bindings ! Thanks for figuring that out

berrange commented 1 year ago

@dcantrell any thoughts on this patch series ? I'm especially interested in your POV of the conditional usage of new parted APIs, since I didn't see strong existing precedent for that, vs blindly assuming the latest parted release is always present.

dcantrell commented 1 year ago

I've read through this and ran the tests locally for my own comfort. I like this patch and I feel it's an improvement over the previous libparted API support. Regarding policy of backwards compatibility....it's been sort of a best effort type thing. When I was originally working on pyparted, I was also upstream for GNU parted so I coordinated with myself. That's all changed now and honestly having pyparted be slightly more robust at runtime is an improvement.

Thanks. I have merged PR#97 and since that was the one that @bcl and I hit locally, I think this is ready to merge.