mgunyho / Little-Utils

Little Utils plugin for VCV Rack
European Union Public License 1.2

teleport - high CPU usage #14

Open martinkrz opened 1 year ago

martinkrz commented 1 year ago

I absolutely love the Teleport IN/OUT modules -- thank you for these. They've made a huge difference in being able to organise my patches.

Unfortunately, it seems that Teleport OUT has very high CPU usage -- I'm seeing ~2% for each Teleport OUT module.

Is there any possibility in the module code for optimization?

Best,

m

mgunyho commented 1 year ago

Hi, thanks for the feedback!

Where exactly does the 2% usage come from? Do you see it in the CPU meter of Rack, or in your task manager?

The implementation of the teleport output is really simple, but it's possible that there is some room for optimization, as I haven't thought about performance that much.

martinkrz commented 1 year ago

Each OUT module was showing 2% in the CPU meter of rack.

I poked at this a bit more, and it seems like VCV doesn't play nicely with multi-threading. On a 10-core M1, the more cores I assign to VCV, the worse things get.

I thought assigning 8 cores to VCV would be "smart", but it was actually causing a lot of audio glitches. And it was at this 8 core setting that the modules were showing 2%. Total rack CPU usage for my patch was 80%!

Setting VCV back to 1 core, dropped total usage to 40% and with this setting the OUT modules show 0.3% usage.

I just started with VCV rack and, I guess naively, thought assigning it more cores would be better. It seems not.

So I think my comment about high usage has much more to do with Rack than with your modules :)

But... (lol) one thing that would be great would be to have the option to have multiple instances of the same IN module. This would allow for setting up teleports to the same OUT module from multiple places.

Right now, there can only be one instance of an IN module with a given name. But, if I could have 3 instances of e.g. IN "drum" module, then I could do the following.

For example, one part of the patch might be doing the kick, and routing it via channel 1 of "drum" IN module.

kick drum modules -> IN (drum, instance 1, channel 1)

Then, elsewhere in the patch, a snare might do the same.

snare modules -> IN (drum, instance 2, channel 2)

hihat modules -> IN (drum, instance 3, channel 3)

If I could have 3 instances of the IN drum module, then I don't need to route from the separate drum parts into the same IN drum module... which might not be close to each of the drum part modules.

I have no idea whether this is simple or hard to implement.

I could see this being immensely useful for consolidating signals into the same teleporter from various regions of the patch.

In particular, for the audio output into a mixer. One could have a "mix" teleport IN/OUT system, with various instances of mix "IN" to which I send the voices. Then, near the mixer I can have my mix OUT module. Right now, I have a separate IN/OUT system for each voice, which is fine (an amazing improvement over running cables all over the patch to the mixer) but would be even "cleaner" with multiple IN instances.

Thanks so much for Little Utils. I'd love to make a donation to the effort via PayPal -- where can I send the funds to?

m

mgunyho commented 1 year ago

> I thought assigning 8 cores to VCV would be "smart", but it was actually causing a lot of audio glitches. And it was at this 8 core setting that the modules were showing 2%. Total rack CPU usage for my patch was 80%!

Do the audio glitches show up due to Teleport, or do they also happen in a patch with no Teleports? I think Teleport can cause audio glitching because of some thread-safety issues, and it would be interesting to investigate whether they really occur, although fixing those would be a bit more complicated.

> I'd love to make a donation to the effort via PayPal

I really appreciate it! I am currently not accepting donations, because the burden of setting up the account and thinking about financials (even if there are no expectations associated with it) is not worth the hassle. If you want, you can donate to The Software Freedom Conservancy who support projects such as Inkscape and Git that make these modules possible, or alternatively The Internet Archive, or any other cause you think deserves it.

mgunyho commented 1 year ago

Oh, and about the multiple inputs idea: I chose to only have one possible input with a given label for a couple of reasons.

First, it is much simpler to implement, because if I wanted to mix (i.e. sum) the signals from multiple sources, I would have to make sure that the summing is happening on the same time step, so I would have to add additional synchronization (which I suspect would degrade performance). Additionally, it could make the UX more confusing, because it would be hard to tell where a signal is coming from.
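To make the synchronization concern concrete, here is a minimal sketch of what read-side summing could look like. It is not taken from Little Utils: `SourceSketch`, `sumSources` and `kMaxChannels` are hypothetical names, and the sketch assumes each source simply remembers the voltages from its most recent process step.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical channel limit for this sketch.
constexpr std::size_t kMaxChannels = 16;

// Hypothetical stand-in for a Teleport IN module: it only remembers the
// voltages it saw during its most recent process step.
struct SourceSketch {
    std::array<float, kMaxChannels> lastVoltages{};  // zero-initialized
    std::size_t channels = 0;
};

// Hypothetical read-side mix: sum whatever each source stored last and
// return the largest channel count seen. Nothing here guarantees that all
// sources were written during the same time step.
inline std::size_t sumSources(const std::vector<const SourceSketch*>& sources,
                              std::array<float, kMaxChannels>& out) {
    out.fill(0.f);
    std::size_t maxChannels = 0;
    for (const SourceSketch* src : sources) {
        if (src == nullptr) continue;  // skip missing sources
        for (std::size_t c = 0; c < src->channels && c < kMaxChannels; c++) {
            out[c] += src->lastVoltages[c];
        }
        if (src->channels > maxChannels) maxChannels = src->channels;
    }
    return maxChannels < kMaxChannels ? maxChannels : kMaxChannels;
}
```

In a sketch like this, a reader that runs between two writers would mix one source's current sample with another's previous sample, which is the kind of mismatch that extra synchronization would have to rule out.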

martinkrz commented 1 year ago

> Do the audio glitches show up due to Teleport,

I get glitches even without teleport when using more than 1 thread.

In fact, as soon as I put VCV in 2+ thread mode, other applications (e.g. YouTube) start glitching.

m

Martin Krzywinski science + art http://mkweb.bcgsc.ca


mgunyho commented 1 year ago

Phew, good to hear that it's not just the fault of Teleport then :) Has anybody reported anything similar on the forums?

Do you think this issue can be closed? We could open a new one for the multi-inputs idea if you want. Although it might not be easy to do due to synchronization issues, as I said.

martinkrz commented 1 year ago

Before closing the issue, is it at all practical to look into the CPU usage of the modules?

I've attached a setup with 12x teleport in and 36x teleport out modules -- typical numbers for how many I use in my patches.

On a 2021 10-core M1 Max MacBook Pro (32 GB RAM), this setup uses around 25% average CPU. Given that multi-threading is glitchy (I'll post around about this some more), this uses up 1/4 of my available CPU power.

Given that no cables are actually connected in this test patch, I'm wondering whether the CPU usage is due to module housekeeping overhead in Rack, more than cycles spent in the module itself.

You'll notice that teleport IN modules use 0.0% in performance meter, but OUT use around 0.4-0.6%.

You did mention that the implementation of the Tel IN/OUT modules is straightforward. I'm wondering whether there's any kind of room for optimization.

Best,

m


mgunyho commented 1 year ago

Hi, I could try to look into the CPU issue a bit in the near future (I don't have the Rack dev environment set up on my computer ATM), although I'm not much of an expert on performance. The implementation is really quite simple, so it's equally possible that it's doing something very stupid and the perf can be improved dramatically, or that there isn't much to optimize. 36 modules is quite a lot, but I suppose they still shouldn't take up that much CPU.

I can't see your attachment. It looks like you tried to send it via email; maybe it has to be uploaded to GitHub directly via the webpage?

> Given that no cables are actually connected in this test patch, I'm wondering whether the CPU usage is due to module housekeeping overhead in Rack, more than cycles spent in the module itself.

The teleport output modules actually consume the same amount of CPU even with no cables connected, because they simply read a value stored by their corresponding inputs. I'll have to compare the CPU consumption to other light-weight modules, like Bias/Semitone.

martinkrz commented 1 year ago

> I could try to look into the CPU issue a bit in the near future

No worries -- thank you.

> so it's equally possible that it's doing something very stupid

I'm hoping it's something as minor as that.

I've tried 72 bias/semitone modules. Without cables, these consume 13% average CPU.

I've attached the patch with 36 teleport modules.

teleport.v01.vcv.zip

Jon-Biz commented 5 months ago

Hi, I'm also looking at this, because I've created a version that scratches my personal itch: the ability to have multiple different inputs or outputs on the same module.

https://github.com/Jon-Biz/VCV-plugins

Working with your code has given me an awesome intro to plugin development and C++ coding. Thanks!

This is what is happening for every active output module process step:

    TeleportInModule *src = sources[label];
    for(int i = 0; i < NUM_TELEPORT_INPUTS; i++) {
        Input input = src->inputs[TeleportInModule::INPUT_1 + i];
        const int channels = input.getChannels();
        outputs[OUTPUT_1 + i].setChannels(channels);
        for(int c = 0; c < channels; c++) {
            outputs[OUTPUT_1 + i].setVoltage(input.getVoltage(c), c);
        }
        lights[OUTPUT_1_LIGHTG + 2*i].setBrightness( input.isConnected());
        lights[OUTPUT_1_LIGHTR + 2*i].setBrightness(!input.isConnected());
    }
    sourceIsValid = true;

If the setChannels or getChannels methods are doing anything expensive before setting/returning the channel count, then moving these calls out of the process step might make things run more smoothly?

If I solve this, I'll put up a PR.
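One detail in the loop above that may be worth checking: `Input input = src->inputs[...]` declares `input` by value, so each iteration copies the port object. Whether that copy matters in practice is not established in this thread. The sketch below shows the same forwarding loop with the input bound by reference instead. `PortSketch`, `forwardAll` and `kNumPorts` are hypothetical, simplified stand-ins, not Rack's or Little Utils' actual types or values.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified stand-in for a Rack port; the real type has
// more members. Only what this sketch needs is modelled.
struct PortSketch {
    float voltages[16] = {};
    uint8_t channels = 0;

    int getChannels() const { return channels; }
    float getVoltage(int c) const { return voltages[c]; }
    void setVoltage(float v, int c) { voltages[c] = v; }
    void setChannels(int n) { channels = static_cast<uint8_t>(n); }
};

// Hypothetical port count for this sketch (the value of
// NUM_TELEPORT_INPUTS is not given in the thread).
constexpr std::size_t kNumPorts = 8;

// Forward every input to the output with the same index.
inline void forwardAll(const std::array<PortSketch, kNumPorts>& inputs,
                       std::array<PortSketch, kNumPorts>& outputs) {
    for (std::size_t i = 0; i < kNumPorts; i++) {
        // `PortSketch input = inputs[i];` would copy the whole struct,
        // voltage array included. A const reference avoids that copy.
        const PortSketch& input = inputs[i];
        const int channels = input.getChannels();
        outputs[i].setChannels(channels);
        for (int c = 0; c < channels; c++) {
            outputs[i].setVoltage(input.getVoltage(c), c);
        }
    }
}
```

A compiler may already optimize the copy away, so this is something to confirm with a profiler rather than assume.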

mgunyho commented 5 months ago

Hi and thank you for the interest!

The source code for Port::setChannels() can be found here, and it is:

    void setChannels(uint8_t channels) {
        // If disconnected, keep the number of channels at 0.
        if (this->channels == 0) {
            return;
        }
        // Set higher channel voltages to 0
        for (uint8_t c = channels; c < this->channels; c++) {
            voltages[c] = 0.f;
        }
        // Don't allow caller to set port as disconnected
        if (channels == 0) {
            channels = 1;
        }
        this->channels = channels;
    }

This does contain a loop, but it only runs when the number of channels decreases (it zeroes the channels that are no longer in use). If the number of channels doesn't change, the loop shouldn't run at all. getChannels() simply returns channels, so that's not a big performance issue either.
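The behaviour described here can be checked with a small instrumented copy of the logic. This is only a sketch: `CountingPort` is a hypothetical stand-in that mirrors the setChannels() quoted above, and the `zeroedWrites` counter is added purely for observation.

```cpp
#include <cstdint>

// Hypothetical stand-in that mirrors the logic of the setChannels() quoted
// above, plus a counter so the work done by each call can be observed.
struct CountingPort {
    float voltages[16] = {};
    uint8_t channels = 0;
    int zeroedWrites = 0;  // instrumentation only, not part of Rack

    void setChannels(uint8_t newChannels) {
        // Disconnected port: return without doing any work.
        if (channels == 0) {
            return;
        }
        // Runs only when the channel count shrinks.
        for (uint8_t c = newChannels; c < channels; c++) {
            voltages[c] = 0.f;
            zeroedWrites++;
        }
        // A connected port keeps at least one channel.
        if (newChannels == 0) {
            newChannels = 1;
        }
        channels = newChannels;
    }
};
```

With this stand-in, calling `setChannels` with an unchanged or larger count leaves `zeroedWrites` untouched, and only a shrinking count performs writes, one per dropped channel.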

I think the best approach for finding performance bottlenecks would be to do some proper profiling and look at flame graphs and such. I do not have much experience with this, so I don't really know what specific tool to recommend or how to set it up, but the VCV plugin development manual has some tips. Another resource I can recommend is the lecture notes of the Programming Parallel Computers course from Aalto University, although the code here is not as massively parallel as the examples considered there.
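Before setting up a full profiler, a rough first check is to time a candidate function in isolation. The helper below is a minimal sketch using only the C++ standard library; `averageNanosPerCall` is a hypothetical name, and the numbers it produces are only a coarse guide, not a substitute for profiling inside Rack.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper: average wall-clock nanoseconds per call of `fn`,
// measured over `iterations` calls. A rough first check, not a profiler.
template <typename Fn>
double averageNanosPerCall(Fn&& fn, std::size_t iterations) {
    if (iterations == 0) {
        return 0.0;  // nothing to measure
    }
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; i++) {
        fn();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

When using a helper like this, the timed code should have an observable side effect (for example, writing to a `volatile` variable) so the compiler cannot remove it, and two variants should be compared under the same build flags.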

In any case, if you find a way to increase the performance, I would be glad to integrate it to my code!

> Working with your code has given me an awesome intro to plugin development and C++ coding. Thanks!

I'm glad to hear that! Although looking at my own code from almost 5 years ago now (whew!), I might disagree about it being a good introduction ;) In my opinion the Rack plugin SDK overall is an excellent starting point for writing C++ audio DSP. The documentation is great and the API and SDK are really well designed to get straight to the fun part, and to avoid the unnecessary toiling with compilation, project setup, etc. usually associated with C++ code.

Relating to this:

> the ability to have multiple different inputs or outputs on the same module.

How do you handle multiple inputs for the same module? Do you just sum them? One reason I explicitly chose to not allow multiple inputs is to avoid confusion about where some sound is coming from.