fkodom / fft-conv-pytorch

Implementation of 1D, 2D, and 3D FFT convolutions in PyTorch. Much faster than direct convolutions for large kernel sizes.
MIT License

License #23

Closed hmaarrfk closed 1 year ago

hmaarrfk commented 1 year ago

I'm trying to package this for conda-forge. Is there any information on the license?

Thank you

Best,

Mark

xref: https://github.com/yoyololicon/fft-conv-pytorch/issues/9

fkodom commented 1 year ago

@hmaarrfk All of my projects are MIT licensed: https://github.com/fkodom/fft-conv-pytorch/blob/master/LICENSE

fkodom commented 1 year ago

Why do you need to package it for conda-forge? This library is already pip installable:

hmaarrfk commented 1 year ago

Sorry, I was blind. I guess the project I'm using is a fork of yours, and I had presumed that they had also included your license: https://github.com/yoyololicon/fft-conv-pytorch/

Maybe it was forked before you added your license.

hmaarrfk commented 1 year ago

> Why do you need to package it for conda-forge? This library is already pip installable:

While you can use pip and conda together, having everything managed by conda helps with updates and maintainability.

conda(-forge) really helps manage packages that have C dependencies between them. For example, you can have opencv depend on ffmpeg and HDF5.

Ultimately, if you want to do this with pip, every single pip package that needs HDF5 support has to compile and bundle HDF5 itself.

In my opinion, conda(-forge) lowers the barrier to integrating C libraries with Python, making it an overall stronger scientific development environment.

Small example:

mamba create --name opencv opencv python=3.10
In [1]: import cv2
In [2]: cv2.hdf
<module 'cv2.hdf'>

pip install opencv-python
In [1]: import cv2
In [2]: cv2.hdf
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[2], line 1
----> 1 cv2.hdf

AttributeError: module 'cv2' has no attribute 'hdf'
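
A check like the one in the two sessions above can also be scripted. The sketch below is only an illustration: the helper name `module_has_attr` is hypothetical, and it relies on nothing beyond the standard library, so it returns `False` instead of raising when the package is missing or the attribute is absent.

```python
import importlib


def module_has_attr(module_name: str, attr: str) -> bool:
    """Return True if `module_name` imports and exposes `attr`.

    Hypothetical helper for illustration; not part of OpenCV or conda.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package is not installed in this environment at all.
        return False
    return hasattr(module, attr)


if __name__ == "__main__":
    # Going by the sessions above, the conda-forge build should report True
    # here and the pip wheel should report False.
    print(module_has_attr("cv2", "hdf"))
```

Running the same script in each environment gives a quick yes/no answer without reading a traceback.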

Again, thank you for pointing me to the obvious place where your license would be. Sorry for the bother on that front.

Best,

Mark

fkodom commented 1 year ago

Interesting -- is that a specific issue with the Conda opencv package? I have installed and used opencv-python, ffmpeg, and h5py together with pip many times. Maybe the cv2.hdf module does something additional that I'm missing?

I suppose this is a more generic request than cv2.hdf, though. I'm curious how often this is an issue in practice?

hmaarrfk commented 1 year ago

Note that the conda opencv package supports cv2.hdf, while the pip package does not.

Using pip, you get to communicate between all these different modules through Python.

So your data must effectively pass through something like a numpy array (this isn't so bad; it's just a wrapper around a strided piece of data).

So you can load your data through h5py, coerce it into a numpy array, then pass it to opencv-python.
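
The pip route described here might look roughly like the following. This is a minimal sketch, assuming `h5py` and `opencv-python` are both installed. The function name, the default dataset name `image`, and the choice of a Gaussian blur are placeholders for illustration, not anything from this thread. The third-party imports sit inside the function only so that the definition can be loaded even where those packages are absent.

```python
def blur_hdf5_image(path: str, dataset_name: str = "image"):
    """Load an image stored in an HDF5 file and blur it with OpenCV.

    Hypothetical example: the dataset name and the blur are placeholders.
    """
    import cv2
    import h5py

    with h5py.File(path, "r") as f:
        # Indexing with [()] reads the whole dataset into memory
        # and returns it as a numpy array.
        image = f[dataset_name][()]

    # OpenCV's Python bindings accept numpy arrays directly, so the array
    # is the hand-off point between the two libraries.
    return cv2.GaussianBlur(image, (5, 5), 0)
```

The data makes a round trip through Python memory here, which is the step a build of opencv with HDF5 support could avoid.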

But if you have more control over how opencv is built for your application, you could, for example, simply ask opencv to open the HDF5 file directly. The HDF5 use case here is somewhat minor, but you can think of it as "a feature of opencv was traded off to simplify packaging, or to make the installable smaller".

So there are a few options for recourse:

  1. Ask nicely for the person that packaged opencv for pip to include HDF5 support. This will take a while.
  2. Use the conda-forge package if it supports what you need
  3. Create your own conda package compiled with the options you need and upload it to your conda channel.

This last option is only available with conda, where the concept of channels has existed from the start. I know that pip can have sources, but the website and infrastructure created by Anaconda (or binstar) make it really easy for others to create their own channels.

> I'm curious how often this is an issue in practice?

This is difficult to measure. Indeed, going with route 1 tends to work out in the long term, but I'm guilty of recompiling programs with lesser-used features enabled for performance enhancements while waiting for upstream to take action.

The ability to have your own (hosted) channel gives you "patience".

One concrete example is the availability of codecs. I used this little snippet of code to test things: https://stackoverflow.com/a/76173072/2321145

```python
import cv2
from pprint import pprint


def is_fourcc_available(codec):
    """Return True if a VideoWriter can be opened with the given FourCC code."""
    try:
        fourcc = cv2.VideoWriter_fourcc(*codec)
        temp_video = cv2.VideoWriter('temp.mkv', fourcc, 30, (640, 480), isColor=True)
        opened = temp_video.isOpened()
        temp_video.release()  # close the writer so the file handle is not leaked
        return opened
    except Exception:
        return False


def enumerate_fourcc_codecs():
    """Pair each candidate FourCC code with whether this OpenCV build can write it."""
    codecs_to_test = ["DIVX", "XVID", "MJPG", "X264", "WMV1", "WMV2", "FMP4",
                      "mp4v", "avc1", "I420", "IYUV", "mpg1", "H264"]
    available_codecs = []
    for codec in codecs_to_test:
        available_codecs.append((codec, is_fourcc_available(codec)))
    return available_codecs


if __name__ == "__main__":
    codecs = enumerate_fourcc_codecs()
    print("Available FourCC codecs:")
    pprint(codecs)
```

Pip:

Available FourCC codecs:
[('DIVX', True),
 ('XVID', True),
 ('MJPG', True),
 ('X264', False),
 ('WMV1', True),
 ('WMV2', True),
 ('FMP4', True),
 ('mp4v', True),
 ('avc1', False),
 ('I420', True),
 ('IYUV', True),
 ('mpg1', True),
 ('H264', False)]

conda-forge:

Available FourCC codecs:
[('DIVX', True),
 ('XVID', True),
 ('MJPG', True),
 ('X264', True),
 ('WMV1', True),
 ('WMV2', True),
 ('FMP4', True),
 ('mp4v', True),
 ('avc1', True),
 ('I420', True),
 ('IYUV', True),
 ('mpg1', True),
 ('H264', True)]
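
The two listings above differ in only a few entries. As a small illustration, the following sketch copies both results into dictionaries and picks out the codecs that the conda-forge build could open but the pip build could not. The variable and function names are made up for this example.

```python
# Results copied from the pip listing above.
pip_results = {
    "DIVX": True, "XVID": True, "MJPG": True, "X264": False,
    "WMV1": True, "WMV2": True, "FMP4": True, "mp4v": True,
    "avc1": False, "I420": True, "IYUV": True, "mpg1": True,
    "H264": False,
}

# Every codec in the conda-forge listing above reported True.
conda_forge_results = dict.fromkeys(pip_results, True)


def only_in_second(first, second):
    """Return codecs that are available in `second` but not in `first`."""
    return [codec for codec, ok in second.items()
            if ok and not first.get(codec, False)]


missing_from_pip = only_in_second(pip_results, conda_forge_results)
print(missing_from_pip)  # ['X264', 'avc1', 'H264']
```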

How valuable is using x264, avc1, h264 through the opencv API for you? That only you can decide.

But using opencv + ffmpeg, or opencv + hdf5, or any pairwise combination (dare I say a three-library combination!) is, in my experience, very difficult to do with just pip.

fkodom commented 1 year ago

Ok, I think I understand. Sounds like conda "channels" effectively take the place of a private PyPI server (or pip source)?

For my understanding -- could the issue also be solved by building from source? If so, that feels more familiar to me. (My instinct is to build from source, and containerize anything that needs to be reused or run on a remote machine.)

hmaarrfk commented 1 year ago

You got it!