Closed lucascolley closed 1 year ago
There is still more work to be done in this case since
scipy.fft
has some keywords which are not in the standard API:overwrite_x
,workers
andplan
. I am unsure of how the handling will work when these arguments are provided, but perhaps providing suitable warnings or exceptions will suffice.
I agree with your last thought - those keywords can remain numpy-only, so if they're used in combination with a PyTorch/CuPy array then raising an exception straight away sounds good to me.
2. There seem to be two current solutions within the module: one which uses
uarray
to dispatch to the appropriate library, and another (in_pocketfft
) which converts to numpy arrays and usespypocketfft.cxx
. I would be grateful for clarification about how would be best to deal with this case.
Initial thoughts: Uarray as a dispatcher for numpy arrays coming in makes sense, but for non-numpy arrays it doesn't quite anymore. While for array API handling we only care about non-numpy arrays. So these two cases can hopefully be cleanly separated. For non-numpy arrays that don't have a direct equivalent that can be called (e.g. cupy.fft.fft), I think the way to go is the pattern for compiled code from Pamphile's cluster
PR: convert to numpy, go through pocketfft, convert back to the original array type and return that.
Thanks for the comments Ralf. Taking scipy.fft.fft
as an example, how does this look?
Thank you for the write up and the flowchart, this is helpful đ
I am not sure about the last part about converting to NumPy array and falling back. That implies a silent conversion and potentially some expensive operations (moving the array from GPU to CPU and then back to the GPU). I think we should error out here and say that the namespace is not compatible. i.e. ask users to do the conversion on their side.
Yeah, exact same thoughts as @tupui said.
I think we should error out here and say that the namespace is not compatible
I don't quite agree - what I meant is to use the regular pattern you established in the cluster
PR:
np.asarray
xp.asarray
(1) should anyway raise an exception for non-CPU arrays, however for non-numpy CPU arrays it will work. And, this is also the same pattern to be used for functions that aren't part of the array API standard.
Ah right agreed đ Lucas, this is for things like torch-cpu for instance where a PyTorch tensor is on the CPU and not the GPU. So it's a no-cost operation to convert a tensor into a NumPy array.
scipy.fft
scipy.fft
currently works on the assumption that its inputs are NumPy arrays. Passing in non-NumPy arrays can cause errors, e.g. (where scipy
is imported as sp
and cupy
is imported as cp
)
>>> sp.fft.fft(cp.exp(2j * cp.pi * cp.arange(8) / 8))
[...]
KeyError: 'ALIGNED is not defined for cupy.ndarray.flags'
The module currently uses uarray
to dispatch to different backends and has pypocketfft
as the standard backend.
In order to make scipy.fft
conform to the Array API standard, three design options have been suggested for the solution. Each of these options aims to deal with NumPy arrays and non-NumPy arrays separately, and for non-NumPy arrays which already implement fft
, to use their implementation. This is visualised for scipy.fft.fft
in the above diagram.
uarray
dispatch structureThis option would leave the current function signatures alone and modify their contents such that non-NumPy arrays are handled correctly.
I don't know how to do this. It is not as simple as adding cases for non-NumPy arrays within each function as the arrays still get caught up in the uarray
dispatch structure. Perhaps this option is feasible with only some minor modifications to the dispatch structure, but I am unsure.
This option proposes to rename the current functions and replace them with new functions to handle all of the required array types. The new functions would be written to match the SciPy API and would call the renamed functions in the case of a NumPy array.
For details of the progress which I have made in the code so far, and where I am stuck, please see PR #27.
This option proposes to make the current fft
module a subpackage of the replacement fft
module. The replacement would be written to conform to both the Array API and - at least in the case of NumPy arrays - the SciPy API. It would use the subpackage for NumPy arrays and handle non-NumPy arrays without using the uarray
dispatch structure at all.
For details of the progress which I have made in the code so far, and where I am stuck, please see PR #28.
The options discussed above are far from fleshed out and it is very plausible that a good design for the solution will take a form which is very different to any of the options, but hopefully they will be useful in focusing our thoughts on how to start implementing a solution. Discussion about the options above (and any alternatives), critique of assumptions I have made and pointing out any misunderstandings are more than welcome (as well as critique of or fixes to the code changes in PRs #27 and #28 of course)!
This option proposes to rename the current files which contain the methods by adding the suffix _np
, replacing them with files of the original name which handle all array types. These new files will import the _np
files and use the current methods just in the case of NumPy arrays.
For details of the progress which I have made in the code so far, please see PR #29.
This seems like the most promising option so far. If we agree that this is the way to go, I will make a new PR taking onboard the feedback from this PR then close all of the design option PRs.
update: we seem to have settled on design option 4, so I have closed the design option PRs and opened a new PR on my fork: https://github.com/lucascolley/scipy/pull/2
I intend this new PR to be a draft for the full PR which will eventually be opened on the main repo. Cheers! :)
closing now to direct all further discussion to the PR on the main repo: https://github.com/scipy/scipy/pull/19005
Is your feature request related to a problem? Please describe.
Array API standard RFC: https://github.com/scipy/scipy/issues/18286 PR adding Array API support and covering
scipy.cluster
: https://github.com/scipy/scipy/pull/18668This issue proposes to extend the Array API support of the above PR to also cover
scipy.fft
, with a view to doing the same for more modules in the future. This will allowscipy.fft
to use arrays from CuPy, PyTorch and other array/tensor libraries.fft
is an array API standard extension module and has matching APIs in both CuPy and PyTorch. The goal is to fully coverscipy.fft
to conform to the Array API standard, as specified in the Array API standard documentation:Describe the solution you'd like.
The solution should mostly follow the patterns set out in the above PR.
scipy.fft
provides some extra functions which are not included in the standardfft
API:fft2
,ifft2
,rfft2
,irfft2
,hfft2
,ihfft2
,hfftn
,ihfftn
. It also provides other functions for Discrete Sin and Cosine Transforms (DST and DCT) and Fast Hankel Transforms, as well as other helper functions and backend control functions.Those functions which are included in the standard should be the most straightforward to implement:
fft
extension, then itsfft
can be used as set out in the standard:There is still more work to be done in this case since
scipy.fft
has some keywords which are not in the standard API:overwrite_x
,workers
andplan
. I am unsure of how the handling will work when these arguments are provided, but perhaps providing suitable warnings or exceptions will suffice.fft
extension, I assume that the implementation will fall back on what is currently within thescipy.fft
module. I do not know whether this is an accurate assumption, or if it is, how exactly it would work. There seem to be two current solutions within the module: one which usesuarray
to dispatch to the appropriate library, and another (in_pocketfft
) which converts to numpy arrays and usespypocketfft.cxx
. I would be grateful for clarification about how would be best to deal with this case.I believe that
array_namespace
from the PR will either error out or produce something which can be handled by one of the two cases above, but please correct me if I have got this wrong.As for those functions which are not included in the standard, I am unsure as to whether changing their implementations is within the scope of this issue (it may very well be). Regardless, there may be some work involved to ensure that they still work as intended alongside the standard functions. I would be grateful for clarification as to how best to approach these other functions.
On a more general note, I am unsure as to how this work will fit in to the module with the existing
uarray
and_pocketfft
code. Any explanation of how to go about adding this support without messing with the existing functionality would be greatly appreciated!Describe alternatives you've considered.
No response
Additional context (e.g. screenshots, GIFs)
I hope that my description of the solution above is along the right lines, but it may be full of misunderstandings! Please correct anything that I haven't got quite right as it would be useful to get rid of any misconceptions from the start!