Open pentschev opened 3 years ago
Some reorganizing seems reasonable and should help with upstreaming.
Was this all done in PR ( https://github.com/rapidsai/ucx-py/pull/711 ) or is there still pending work?
Also, as to binary size, this is in part a consequence of Cython including some repeated symbols in different libraries. Please see issue ( https://github.com/cython/cython/issues/2356 ) for context. There are some things we can do in terms of compiler flags, but there are some things that are out of our hands.
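As a rough illustration of the compiler-flag side, the sketch below collects size-oriented options into keyword arguments that could be passed to a setuptools `Extension`. The helper name `size_flags` and the choice of flags are assumptions for illustration, not UCX-Py's actual build configuration; the flags are standard GCC and GNU ld options and may need adjusting for other toolchains.

```python
def size_flags(strip_symbols=True):
    """Return Extension keyword arguments that favor smaller binaries.

    Hypothetical helper: the flag selection is an assumption, not what
    UCX-Py's setup.py actually uses.
    """
    compile_args = [
        "-Os",                  # optimize for size instead of speed
        "-g0",                  # omit debug information
        "-ffunction-sections",  # place each function in its own section
        "-fdata-sections",      # place each data item in its own section
    ]
    link_args = ["-Wl,--gc-sections"]  # let the linker drop unused sections
    if strip_symbols:
        link_args.append("-Wl,--strip-all")  # remove non-dynamic symbol info
    return {"extra_compile_args": compile_args, "extra_link_args": link_args}


# Intended use (requires setuptools, which is not imported here):
#   Extension("ucp._libs.ucx_api", ["ucp/_libs/ucx_api.pyx"], **size_flags())
```

Flags like these only trim what the compiler and linker emit per module; they would not deduplicate the helper code that Cython repeats in each separately built library.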
> Some reorganizing seems reasonable and should help with upstreaming.
> Was this all done in PR ( #711 ) or is there still pending work?
I think #711 was already a big chunk of that work, followed by #716. We no longer have any files in the "core" library that are over 300 lines, the only exception being https://github.com/rapidsai/ucx-py/blob/3111d97421f8df8f2ebb0042a84800e2b1b5ced4/ucp/_libs/ucx_api_dep.pxd , which contains only the C prototypes and is currently just under 500 lines. In any case, IMO this is already enough for the upstreaming effort, as it matches UCX's limit of 500 lines of code per PR from the General Guidelines for Contributors.
The reason I opened this issue was to discuss whether it's worth creating a new Extension per class, or just including them all in a single Extension as done in https://github.com/rapidsai/ucx-py/blob/3111d97421f8df8f2ebb0042a84800e2b1b5ced4/ucp/_libs/ucx_api.pyx now. In particular, I think one Extension per class would increase both code and binary complexity without any clear advantage, so I'm not sure it's worth doing.
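To make the two layouts concrete, here is a minimal sketch of how the extension lists would differ. It only builds `(module_name, sources)` pairs, each of which would be handed to `setuptools.Extension(name, sources)` in a real `setup.py`. The function names and the per-class file names are hypothetical, chosen for illustration only.

```python
from pathlib import PurePosixPath


def _module_name(path):
    """Turn 'ucp/_libs/ucx_api.pyx' into the dotted name 'ucp._libs.ucx_api'."""
    return ".".join(path.with_suffix("").parts)


def single_extension(package_dir, main_pyx):
    """One module: the main .pyx pulls in the other files via `include`."""
    path = PurePosixPath(package_dir) / main_pyx
    return [(_module_name(path), [str(path)])]


def extension_per_file(package_dir, pyx_files):
    """One module (and one .so) per .pyx file; each also needs its own .pxd."""
    specs = []
    for pyx in pyx_files:
        path = PurePosixPath(package_dir) / pyx
        specs.append((_module_name(path), [str(path)]))
    return specs


# Hypothetical file names, not the actual UCX-Py layout.
files = ["ucx_worker.pyx", "ucx_endpoint.pyx", "ucx_listener.pyx"]

print(single_extension("ucp/_libs", "ucx_api.pyx"))
print(extension_per_file("ucp/_libs", files))
```

With the single-extension layout, users keep importing everything from one module. With the per-file layout, every entry becomes its own shared object and its own import path.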
> Also, as to binary size, this is in part a consequence of Cython including some repeated symbols in different libraries. Please see issue ( cython/cython#2356 ) for context. There are some things we can do in terms of compiler flags, but there are some things that are out of our hands.
Thanks for referencing that issue. To me it doesn't seem like we have a much better alternative for UCX-Py's core library than what it currently is.
Based on what I wrote above, do you have any opinions on whether we should do something different (e.g., one Extension per class) or if it looks reasonable in its current state?
To make the code more readable and later make it easier to upstream, we have split ucx_api.pyx into multiple files in https://github.com/rapidsai/ucx-py/pull/711 . Each file is included into ucx_api.pyx to keep the model of ucx_api being the only extension/module/.so file and thus be a non-breaking API change. Another alternative is to add a new .pxd file for each of the newly created .pyx files, but this requires creating a new Cython Extension, which leads to a new .so file for each extension. I did such a change in this branch, but this also leads to an increase of almost 3x in binary size.
Are there other reasons why we should or should not do as above?
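For anyone wanting to reproduce a binary-size comparison like the one mentioned above, a minimal sketch is to sum the shared objects produced by each build and take the ratio. The function names and the build directory names in the usage note are assumptions, not paths from the UCX-Py repository.

```python
from pathlib import Path


def total_so_size(build_dir):
    """Sum the sizes in bytes of all shared objects below build_dir."""
    return sum(p.stat().st_size for p in Path(build_dir).rglob("*.so"))


def size_ratio(single_build_dir, split_build_dir):
    """Return how many times larger the split build is than the single one."""
    single = total_so_size(single_build_dir)
    if single == 0:
        raise ValueError(f"no .so files found under {single_build_dir}")
    return total_so_size(split_build_dir) / single
```

For example, `size_ratio("build-single", "build-split")` would compare two hypothetical build trees, one with a single `ucx_api` extension and one with an extension per file.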
cc @jakirkham @madsbk