logicomacorp / WaveSabre

Official WaveSabre repository
MIT License
246 stars 34 forks source link

WIP: Port replayer to Linux #49

Open PoroCYon opened 4 years ago

PoroCYon commented 4 years ago

This one is absolutely not ready to be merged yet, but, I want to discuss a few things as I go along:

P.S. if needed, I can separate the non-Linux-specific patches (eg. using std::atomic, reworking the maths routines, etc.) into separate PRs, then rebase this one on them.


Roadmap:

emoon commented 4 years ago

I would really take a look at size generated when including stuff such as std::atomic and std::thread as it may add extra overhead that isn't really needed.

PoroCYon commented 4 years ago

Very good point, but, I can't try this myself (I don't have any Windows machines myself), so, would anyone else be willing to test this?

(std::atomic is compiled directly to cmpxchg/lock xadd instructions, from what I've seen in gdb, but, I don't know about std::thread of course.)

PoroCYon commented 4 years ago

Hm, turns out using std::thread will need a nontrivial amount of work (w. signalling between threads) to have it working properly, as it can't shut down threads forcefully by itself.

PoroCYon commented 4 years ago

Seems that only FFmpeg and libgsm (the latter kinda) support the GSM 6.10 codec WaveSabre uses, and neither are available by default on Ubuntu. Opus, Vorbis and Speex are available, however. So I'm thinking of doing the following:

Thoughts?

yupferris commented 4 years ago

Thanks for looking into this. Some thoughts:

I believe some of this work may overlap some of the work that's already been done/is being done by @kusma / @alkama / others. This topic comes up sometimes in the #development channel in our slack, usually under the broader umbrella of supporting both mac and linux to some degree. I got the impression from the demoscene discord that you weren't interested in slack, which is unfortunate, as it would be nice to try and coordinate these efforts. It's not crucial, though; I'm happy to accept helpful changes either way.

When it comes to linux support in general (really any platform that's not Windows), I'm actually a bit hesitant, because neither @djh0ffman nor I currently use these platforms, nor do we plan to in the future, so we cannot test them or ensure they're always maintained. Luckily we have some CI pipelines set up as well as interested folks that could take up this burden, but we'll have to communicate that this will likely be a lower "tier" in terms of support, and might need some TLC once in awhile. That said, the core parts of WS don't really change that often anymore, so maybe it's not that big of a deal. When I see a large diff like this with lots of conditional compilation, though, I must admit I feel a bit scared/concerned.

It's absolutely paramount that @djh0ffman (especially!) and I can work effectively making new devices and releasing intros. This is the #1 priority for the project still - that logicoma as a group is able to use our tool the way we intend to. Everything else is secondary. I think it's unlikely, but if a situation ever arose where we made a decision to merge a piece of code that broke x64/linux/mac support, even temporarily, we should not hesitate to do it if it serves one of our intros. We try to keep things backwards compatible and have generally done a good job of this, so it's not like we don't care about other users, but I feel this is an important reminder nonetheless. We will rely solely on the community to maintain non-Windows (really anything that's non-our-intros) - so my hope is that by contributing this, that you understand this and are open to being available for future maintenance if possible.

Anyways, what I see mostly looks like it's headed in the right direction, so thanks again for that. It will definitely be more consumable/easier to review and consider split out as separate PR's, but I totally get that it's easier to paint in broad strokes first, so we can do that later.

My thoughts so far on specific stuff:

I'm fine with inline asm where we need it even though it's clearly not ideal; historically this has only affected the win32 version of the player and we just use standard stuff for non-size-optimized builds (eg in the VST plugs, which previous cross-platform work has been mostly focused on). Since this work focuses mostly on the replayer, which adds complication in two ways - one being that now we need to support GCC, and the other being that we may need to support x64 as well. My gut feeling is that since this only really covers a couple functions it shouldn't be that too bad and I feel that small 2x2 matrix is manageable (technically 2x3 if we still want to fall back to standard stuff for some builds - which I'd actually like not to do if it can be avoided, again to reduce maintenance burden). However I'd like to see all impls match in the end as much as they can, especially if we rework the code to do pow better (which I'm in favor of btw, see my comments on #37). I also like the idea of using inline asm over the full body, it would be nice if that change were done for the windows builds too. If it makes things easier, we could consider only adding support for x86 first or something.

I'm concerned about the threading stuff. We know the current code ends up very small and efficient on Windows, and I assume you're at least able to do some kind of packing test on linux to see that that's also the case there, but if we choose to use std stuff we need to make sure it's still small and fast on Windows. If it is, and doesn't cause linker trouble etc, I wouldn't mind doing something cross-platform as the only impl, but this needs to be checked. I'd certainly prefer if only the threading-specific changes could be split out into a separate PR so those could be looked at/tested in isolation. I understand this might be difficult, so perhaps we can split out/land other stuff in the meantime and work towards that goal if possible.

GSM6.10 is a shit codec. Since Opus is supported on Windows 10, I'd much rather see Specimen deprecated in favor of a new device (eg a Specimen 2) that uses Opus instead. I believe this is the best path forward; I don't much like the idea of supporting multiple codecs in our devices if we can avoid it.

Similarly, Thunder is already a deprecated device; while I don't think we've formalized the deprecation process beyond letting the linker optimize dead code and some preliminary work to not build deprecated VSTs, I'd much rather not support it if we decide to change sample formats. I don't think it's worth getting this to work on other platforms either as it just increases maintenance burden, so I'd rather not have that in the main repo.

I've also experimented a bit with having our own custom codec which would obviously be supported on all platforms but this work is too early to count on at this point; it's just an experiment that may never work out. We'll see.

For Adultery, I'm actually leaning towards dropping it on platforms other than Windows. Perhaps an alternative device that's a linux-only replacement could make sense. In general I like the idea of having platform-specific devices where it makes sense; my hope with WS in general is not that folks would rely only on the standard devices, but that they'd add their own crazy stuff (on branches). There's loads of untapped potential here, and I imagine that's especially true for Linux!

I don't like the idea of conditionally supporting a non-threaded player. I think this just makes things more confusing with the conditionals, and most tracks produced nowadays are very CPU-heavy; this is not going to change anytime soon, so leaning into a threaded player is the best way forward.

PoroCYon commented 4 years ago

I believe some of this work may overlap some […]

I see. My wording on the Demoscene discord might've been a bit blunt, but, I'd rather keep the number of accounts I have on random websites to a minimum, especially if it's owned by an Evil™ corporation. But of course this makes coordination harder. (Initially this was just for asking for a test track, but, now it's for a different matter.)

When it comes to linux support in general […] It's absolutely paramount that […]

That's totally fair and understandable, and it's usually been like that with many projects in my experience. I'm used to cleaning up the mess to make things run on Linux again :P

I'm fine with inline asm where we need it […]

x64 support will be needed because on Linux, you typically don't have 32-bit libraries anymore, so either you have to use 64-bit code, or circumvent the library requirement somehow.

For the Pow2/Pow4 stuff, I think it's more efficient to just always implement it as an inline function that does just return x*x; or x=x*x; return x*x;, as compilers are likely able to optimize that in the best way possible.

I don't like the idea of conditionally[…]

In my own (4k) intro framework, the graphics code runs as a 64-bit process, while it forks into a 32-bit standalone program that produces audio and then sends everything to aplay (which is a tiny terminal program that just plays raw samples from a pipe), but, that's a bit hacky of course. But it's also why I initially made a multithreading-less option as well, as it runs in a completely separate process. Although this is still a reflection of my current 4k tools, I don't know how realistic this would be in practice, especially for 64ks.

(Furthermore, smol doesn't support TLS and some pthread init stuff out of the box yet (because nobody needs that in a 4k context, for which smol is made), but that's something I should fix on my end.)

So yeah, I guess I could remove this, at least for anything that isn't my private branch.

I'm concerned about the threading stuff […]

You can scrap the std::thread stuff already, it's not that viable on first sight due to its API anyway. std::atomic does seem interesting to look into, though, as it just compiles into the required assembly instructions without emitting any function calls, idk how that works with Windows' InterlockedCompareExchange etc. But someone (who is empathically not me) needs to test this first, of course.

GSM6.10 is a shit codec. Since Opus is […]

That's indeed not a bad idea, however, I'd like to at least be able to replay some tracks that aren't made specifically for Linux (or render them to a WAV), outside of a compo context. But that's not the main goal here of course.

Also, how many new features would there be needed in a Specimen 2, aside from Opus (and a bitrate slider maybe)? (I have no clue about the answer to this question, I'm definitely not a musician, at all.) If it's not much, then it might be easier to just keep using Specimen. But then there's the argument of still including the legacy GSM codepath, of course.

Similarly, Thunder is already a deprecated device […]

So that's why Renoise crashed when Molive tried to make a test track using it for me :P . I simply took Thunder first because its code looked a little simpler, but moved to Specimen for that reason.

For Adultery, I'm actually leaning towards dropping […]

A platform-specific device definitely makes sense here (along with maybe a libfftw-based effect or something. But, anyway,), but, again, I'd like to have some sort of playback functionaly for Windows tracks in a non-compo context, as a nice-to-have thing. (Eg. gm.dls is sometimes present in a wineprefix, so it can be stolen from there as a fallback or something.)

emoon commented 4 years ago

Just going to chip in a bit here. What I want to do is to add support for Wavesabre in my music player so this work being done is essential for me to do so, but at the same time I fully understand what Ferris saying that hacking can be done on the synth as required for shipping intro which totally makes sense.

My use case is really at the bottom of the list of what Wavesabre targets and if that can't be achived that isn't the end of the world but I think it just would be cool. A reasonable person would say: "Why not use the generated FLAC/WAV/MP3 instead?" I would say: "Now where is the fun in that?" :) But really one can do way more cool visualizations having the original data (such as tracker style playback, pianos, scopes for all channels, etc)

And yes, it will eat lots of CPU but to me that is fine. If someone can't use it because of CPU that's how it is, and the rendered version can be used instead.

Now as backwards compability has been mentioned above that is of course something I care about for my use-case otherwise I will have to deal with several version of the code-base to play specific songs and while doable it becomes kinda annoying so if it's possible to keep that I would be happy :)

PoroCYon commented 4 years ago

Ok so I've sent out #50 , #51 , #52 , #53 .

After this, I'll send a PR to disable all the Windows-only stuff on not-Windows, but this depends on #52 , so I'm first waiting for that one to be merged.

Then, I'd add the patches for pthread-based threading and an SDL2/aplay backend/renderthread. However, I'm not exactly sure in which order I'd send these, as these two are more or less interdependent. So what would be the preferred approach here? Send them chunk-by-chunk (and the first one then having a broken non-Windows compile), or both in one PR, or maybe first add the SDL2/aplay renderthread with only the single-core WAV writer enabled on non-Windows, and in a following PR, add the pthread stuff which reenables the prerender and realtime players, and multithreadd WAV writing? Either way, after those PRs, I'll send the ffmpeg and wineprefix fallbacks for Thunder/Specimen and Adultery.

(I'll be keeping this branch/PR for experimentation and WIP and misc stuff, and I'll rebase it on the current master every so often.)