jserv closed this issue 2 years ago
Hi Jim,
Thank you for reaching out.
This repository of mine contains many prototypes, creative experiments/apps and library code. I think I just copied the file from one place to another without thinking too much about consolidating it into one place. I will consider adding a submodule reference to its origin.
I was missing _MM_TRANSPOSE when I was porting some of my SSE code to ARM using SSE2NEON. I see that's been implemented now. Nice! I actually went ahead and researched how to do a 4x4 floating point transpose using NEON interleaved/transpose load intrinsics, but realized it wasn't so obvious. In the end I found an implementation, which I added here: https://github.com/marcel303/framework/blob/5c589ed6bd7c7bc8dc6f3c7ecde495bfee660db3/1stparty/binaural/neon-transpose.h#L40 (source: http://tessy.org/wiki/index.php?NEON%A4%C732bit%A4%CE%C5%BE%C3%D6). This version is very similar, if not identical, to yours!
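For readers following along, here is a minimal sketch of one way a 4x4 float transpose can be done with a NEON de-interleaving load. It is not the code from the linked neon-transpose.h or from sse2neon. The function name `transpose4x4` is hypothetical, and the scalar branch is only there so the sketch also builds on non-ARM targets.

```cpp
#include <cstddef>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

// Hypothetical helper: transposes a row-major 4x4 float matrix.
// `in` and `out` must each point to 16 floats and must not overlap.
inline void transpose4x4(const float* in, float* out)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    // vld4q_f32 de-interleaves with a stride of four elements, so
    // val[c] receives in[c], in[c + 4], in[c + 8], in[c + 12],
    // which is exactly column c of the row-major input.
    const float32x4x4_t columns = vld4q_f32(in);
    vst1q_f32(out + 0,  columns.val[0]);
    vst1q_f32(out + 4,  columns.val[1]);
    vst1q_f32(out + 8,  columns.val[2]);
    vst1q_f32(out + 12, columns.val[3]);
#else
    // Portable fallback with the same result: out[c][r] = in[r][c].
    for (std::size_t r = 0; r < 4; ++r)
        for (std::size_t c = 0; c < 4; ++c)
            out[c * 4 + r] = in[r * 4 + c];
#endif
}
```

The load-based form only applies when the matrix is in memory. When the four rows are already in registers, a vtrnq_f32-based shuffle is the usual alternative.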
As for use cases: I mostly use it for the binauralization library I wrote. See: https://github.com/marcel303/framework/tree/master/1stparty/binaural
It performs FFTs on four audio buffers in parallel (using SSE/NEON), and performs convolution with the left/right binaural HRTFs in parallel. I wanted it to run fast on ARM hardware, since that is what the Oculus Quest uses internally.
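To illustrate the idea of processing four buffers side by side, here is a rough sketch of the frequency-domain step of such a convolution: each bin of four spectra is multiplied by the matching bin of one filter spectrum. This is an assumption about the general technique, not the binaural library's actual code. `Float4` and `multiplySpectra` are hypothetical names, and the inner lane loop stands in for what a single SSE/NEON instruction would do.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical type: four independent channels stored side by side,
// mirroring the layout of one 128-bit SIMD register.
struct Float4 { float lane[4]; };

// Hypothetical helper: multiplies every frequency bin of four spectra
// (re/im) by the same filter spectrum (filterRe/filterIm).
// All input vectors are assumed to have the same length.
inline void multiplySpectra(
    const std::vector<Float4>& re, const std::vector<Float4>& im,
    const std::vector<float>& filterRe, const std::vector<float>& filterIm,
    std::vector<Float4>& outRe, std::vector<Float4>& outIm)
{
    const std::size_t bins = re.size();
    outRe.resize(bins);
    outIm.resize(bins);

    for (std::size_t k = 0; k < bins; ++k)
    {
        const float c = filterRe[k];
        const float d = filterIm[k];

        // With SIMD, this loop over the lanes collapses into a few
        // vector multiplies and adds.
        for (std::size_t l = 0; l < 4; ++l)
        {
            const float a = re[k].lane[l];
            const float b = im[k].lane[l];

            // (a + bi) * (c + di) = (ac - bd) + (ad + bc)i
            outRe[k].lane[l] = a * c - b * d;
            outIm[k].lane[l] = a * d + b * c;
        }
    }
}
```

Note that a pointwise product of spectra corresponds to circular convolution. Linear convolution of a stream additionally needs zero padding and a scheme such as overlap-add, which the sketch leaves out.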
The second use case is for some water/wave simulation that runs at audio rate, for some weird but convincingly physical sounding audio synthesis.
Cheers, Marcel
Fully unusable pile of useless code. Killed 6 hours trying to even run just a single demo.
Hi @ZalgoSoft, which platform / OS did you try to build for?
Windows 10 64-bit, VS2019 Community. I'm looking for a flexible node flow engine which will support and process a high data flow, about 10 MSPS. For now I run ImNodes, which is part of ImGUI, and some time ago I discovered the DirectShow code library, which is acceptable for me. What I really need is a good real-time data flow manager/orchestrator/arbiter with circular buffers etc. I wrote my own lightweight data flow framework similar to DirectShow and Jack, but I lack the ability to add nodes, heh.
Hi @ZalgoSoft,
The first problem you probably encountered is that some of the libraries depend on 32-bit statically compiled binaries (e.g. ffmpeg/avcodec), and no 64-bit version exists. I tried to change the generate script to tell CMake to produce a project file targeting x86. However, it seems this is not the only issue, as I found a weird compile error when trying to compile some code which uses a std::unordered_map.
I will research this some more when I next have time, as I do want to tackle this problem.
For what it's worth, everything works well using VS2017. It seems VS2019 switched to 64 bits by default, and some compiler behaviour also changed.
@ZalgoSoft By the way, are you looking to do work on GPU or the CPU at 10Ms/s?
@ZalgoSoft I've updated the build, generate and archive scripts to explicitly tell CMake to generate project files for a 32-bit target. This fixes most of the issues. I've also addressed a few compile errors that only seem to happen with VS2019+. You should be able to build & run most of the apps and demos with these changes, including the graph system. I'm still trying to resolve a compile issue with one of the third party dependencies. I'm not sure why, but std::unordered_map gives a compile error on a simple map using std::string when compiling ImGuiColorTextEdit.
@marcel303 thanks a lot, I will try your code soon. All I need is the audio processing / dataflow part of your framework. I am trying to build a data flow which takes advantage of GPGPU processing of audio or radio signals, so I chose OpenCL as the most versatile solution. Before this I did successful computations with CUDA, but today's requirements call for a more generic solution. The question is about a 1-10 MSPS flow of byte/complex/float samples: conversion, enhancement, filtering, and visual rendering of the signal's waterfall and spectrum.
There are two copies of the file SSE2NEON.h, at the following paths:
1stparty/binaural/sse2neon/SSE2NEON.h
users/marcel/ovr-cubes/SSE2NEON.h
Their content is identical.
I am curious how SSE2NEON is used in this project. Meanwhile, SSE2NEON is being actively developed at https://github.com/DLTcollab/sse2neon. If SSE2NEON is used in this project, please consider migrating to the newer SSE2NEON, which brings more SSE intrinsics, performance enhancements, and fixes.