Open hobbes1069 opened 4 years ago
Couple of other ideas:
Ok, so basically create multiple versions of the handful of functions and use runtime detection in FreeDV to know which to use?
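As a rough illustration of that idea, here is a minimal sketch of runtime dispatch through a function pointer. The kernel (`dot_generic`), the placeholder AVX variant, and `select_dot` are all hypothetical names, not functions from LPCNet; the only real API used is the GCC/Clang builtin `__builtin_cpu_supports`, so an MSVC build would need a different detection path.

```c
#include <stddef.h>

/* Hypothetical kernel signature: dot product of two float vectors. */
typedef float (*dot_fn)(const float *a, const float *b, size_t n);

/* Portable fallback. Real AVX/SSE variants would live in separate
   translation units compiled with -mavx, -msse4.1, and so on. */
static float dot_generic(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Placeholder standing in for an AVX build of the same kernel.
   It just calls the generic code so the sketch stays portable. */
static float dot_avx_placeholder(const float *a, const float *b, size_t n)
{
    return dot_generic(a, b, n);
}

/* Pick an implementation once at startup and reuse the pointer. */
static dot_fn select_dot(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx"))
        return dot_avx_placeholder;
#endif
    return dot_generic;
}
```

The caller would store the result of `select_dot()` once and call through it, so the detection cost is paid only at launch.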
Couple of ways:
However as per my previous mailing list posts - just because we can doesn't mean we should...
I still have strong misgivings about pushing this out into the wild. It will be a support nightmare, as we won't know which machines will run and which won't. Think I'd feel better if we had some automation to determine if it will run in real time.
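One possible shape for that automation is a short startup benchmark that compares processing time against audio duration. The sketch below is an assumption about how such a check could look, not existing FreeDV code: `realtime_factor`, `fast_enough`, `dummy_frame`, and the 0.5 threshold are all made up for illustration, and `clock()` measures CPU time rather than wall-clock time.

```c
#include <time.h>

/* Hypothetical workload type: processes one frame of audio. */
typedef void (*frame_fn)(void *ctx);

/* Stand-in workload. A real check would run the decoder on a fixed input. */
static void dummy_frame(void *ctx)
{
    volatile double *acc = (volatile double *)ctx;
    for (int i = 0; i < 1000; i++)
        *acc += i * 0.5;
}

/* Returns CPU seconds spent per second of audio (a "real-time factor").
   Values well below 1.0 suggest the machine can keep up. */
static double realtime_factor(frame_fn process, void *ctx,
                              int frames, double frame_seconds)
{
    clock_t start = clock();
    for (int i = 0; i < frames; i++)
        process(ctx);
    clock_t end = clock();
    double elapsed = (double)(end - start) / CLOCKS_PER_SEC;
    return elapsed / (frames * frame_seconds);
}

/* The threshold is an assumption: it leaves headroom for the rest of the app. */
static int fast_enough(double factor)
{
    return factor < 0.5;
}
```

If the factor came out too high, the application could grey out the mode and explain why, rather than letting the user discover broken audio.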
Also - the vast majority of users run Windows, so we need a cross platform way to handle this.
I don't disagree, but with this being developed in the open, there are already people using it even if it's not ready for "prime time" yet. We do know there is interest, so I think trying to solve these technical issues sooner rather than later is still in our best interest.
Sure, happy to keep brainstorming and I'd like to see the technical issues solved too, in a way that minimises support and helps the end users.
Some suggested tasks:
Re:
We do know there is interest
I've seen recent interest from package maintainers, but have you had any interest specifically from end users? I was wondering if there are records of the number of times freedv-gui packages are installed for example?
I've had several Windows users complain they can't use 2020 because of AVX.
Well, specifically we know quisk is interested in supporting the mode, which is the current driver for packaging it. As far as Windows goes, yes, that's unfortunate, but it seems AVX is the best baseline we've found. AVX2 provided very limited benefit.
Please also keep in mind that one SSE version is not enough: just like AVX (AVX1) and AVX2 are two separate things, SSE (SSE1) intrinsics are not the same thing as SSE2 intrinsics nor SSE3 intrinsics nor SSE4.1 intrinsics. See my comments in #25: https://github.com/drowe67/LPCNet/pull/25#issuecomment-620915421
@kkofler thanks for your comments, they will be useful the next time someone works on this code. There are several other tasks we need to resource too, as detailed above.
@hobbes1069 I wonder if it's time to revisit this again? SSE support would open up FreeDV 2020 to many more people if it can be managed.
@tmiw I would be interested in your thoughts.
Key issue for me is to avoid end user problems i.e. "it doesn't work" bug reports because they are using a machine that doesn't have the CPU/SIMD power.
I am open to ideas on how we handle that :slightly_smiling_face:
@hobbes1069, @drowe67, I'm thinking a single library would be best from a distribution perspective.
That said, how far back are we expecting to support hardware-wise? I know for the macOS version of FreeDV, for instance, we only support 64-bit Intel and ARM (and even then, we only go as far back as macOS 10.11, which AFAIK only supports Apple machines with AVX/SSE).
That's a good question. I'd suggest not very far back, or based on what we can handle with AVX/SSE (current flavor) and None (no acceleration), decided by a simple speed test and a reasonable amount of development. During 700D development I discovered quite a few Hams with very old (XP era) hardware; we can probably rule that out.
I know Fedora on x86_64 assumes SSE1 is available and no longer distributes a 32-bit install (but 32-bit binaries and libraries are still available if needed). Of course this is all stuff we can or can't assume at build time. There are ways to dynamically test for and use the various instruction sets at program launch, but the implementation is certainly beyond me.
Fedora actually assumes SSE2 on x86 these days (even for the 32-bit multilibs). (That also implies that the older extensions, i.e., MMX and SSE1, can be assumed as well.) But SSE3 and higher (including any level of AVX) still have to be detected at runtime (or disabled entirely) in Fedora binaries.
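To make the "each level is a separate thing" point concrete, here is a small sketch that queries every level on its own and then reports the highest one present. The struct, enum, and function names are hypothetical; the feature strings passed to `__builtin_cpu_supports` are the ones GCC and Clang document for x86. On other compilers or architectures the sketch simply reports nothing detected.

```c
/* One flag per instruction-set level; each is a separate CPU feature. */
struct simd_levels {
    int sse, sse2, sse3, ssse3, sse4_1, sse4_2, avx, avx2;
};

enum simd_best {
    SIMD_NONE, SIMD_SSE, SIMD_SSE2, SIMD_SSE3,
    SIMD_SSE4_1, SIMD_AVX, SIMD_AVX2
};

/* Query each level independently (GCC/Clang on x86 only). */
static struct simd_levels detect_simd(void)
{
    struct simd_levels s = {0};
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    s.sse    = __builtin_cpu_supports("sse")    != 0;
    s.sse2   = __builtin_cpu_supports("sse2")   != 0;
    s.sse3   = __builtin_cpu_supports("sse3")   != 0;
    s.ssse3  = __builtin_cpu_supports("ssse3")  != 0;
    s.sse4_1 = __builtin_cpu_supports("sse4.1") != 0;
    s.sse4_2 = __builtin_cpu_supports("sse4.2") != 0;
    s.avx    = __builtin_cpu_supports("avx")    != 0;
    s.avx2   = __builtin_cpu_supports("avx2")   != 0;
#endif
    return s;
}

/* Highest level we would have a build for, checked best first. */
static enum simd_best best_level(const struct simd_levels *s)
{
    if (s->avx2)   return SIMD_AVX2;
    if (s->avx)    return SIMD_AVX;
    if (s->sse4_1) return SIMD_SSE4_1;
    if (s->sse3)   return SIMD_SSE3;
    if (s->sse2)   return SIMD_SSE2;
    if (s->sse)    return SIMD_SSE;
    return SIMD_NONE;
}
```

Which levels are worth a dedicated build is a separate question; the list in `best_level` is only an example ordering.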
For reference, here's how it's currently done in freedv-gui. Granted, that's only for AVX and not SSE, but there's also this page from Microsoft that shows how to get the others.
@kkofler so my memory failed me. I wish this was better documented somewhere.
Ok, while trying to figure out crazy and mostly unworkable ways to make this work for both developers and distros I came up with this idea:
Build the lpcnetfreedv library multiple times with and without optimizations. Because of its size, it would be good to separate the nndata into a separate library unless you think that will hurt performance. Otherwise that's tens of MB added to each library.
Then FreeDV could check for multiple libraries and load the "best" candidate.
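A sketch of what "load the best candidate" could mean in practice: rank the builds by what the CPU supports, then try them in order. The library file names below are hypothetical, as is `rank_candidates`; the actual loading step is left out and would use `dlopen` on POSIX systems or `LoadLibrary` on Windows.

```c
#include <stddef.h>

/* Hypothetical file names; a real build would define its own naming. */
struct candidate {
    const char *name;
    int usable;   /* nonzero if the CPU supports what this build needs */
};

/* Fill `out` with usable library names, best first; returns the count. */
static size_t rank_candidates(int has_avx, int has_sse41,
                              const char *out[], size_t max)
{
    const struct candidate all[] = {
        { "liblpcnetfreedv_avx.so",     has_avx   },
        { "liblpcnetfreedv_sse41.so",   has_sse41 },
        { "liblpcnetfreedv_generic.so", 1         },
    };
    size_t n = 0;
    for (size_t i = 0; i < sizeof all / sizeof all[0] && n < max; i++)
        if (all[i].usable)
            out[n++] = all[i].name;
    return n;
}
```

The caller would walk the returned list and stop at the first library that loads, so a missing optimized build falls through to the generic one.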
I'm also experimenting with a different method of downloading the nnet data by creating a custom target so you can "make download" or it will download on the fly during make instead of during configuring.