Closed jet082 closed 4 months ago
ChatGPT 4o is truly remarkable. On a lark, I pointed it at this repo, explained the problem, and asked it to help me out and it provided patches that fixed the problem. Now it's using the GPU and going quite quickly, even for (bs) roformers. I have no idea how much you can use from these files, but they do fix the issue for me. Perhaps they can help point you in the right direction.
Nice one! Could you raise a PR with the fix please?
Thing is, these are automated patches from ChatGPT... They seem minor, but I don't know if they'll break your existing code? I can make a PR of course, but it might be better to simply diff the files and patch in strictly what is necessary.
For sure, that's why I want to see a PR 😄 (easiest way for me / others to see and review the diffs)
FWIW most of the code I've added to this codebase was heavily AI-assisted, initially using Github Copilot, then over the last 6 months or so using Cursor (https://cursor.sh) as my IDE instead of VSCode, which now uses gpt-4o for each suggestion / request!
Okay I made a PR, but please do not merge it blindly as I do not trust it to follow your coding conventions or anything.
Also, there have been no changes to the other roformer Python code, so you will need to apply the changes there too.
https://github.com/karaokenerds/python-audio-separator/pull/74
Hey @jet082 please could you try the latest release, version 0.17.2?
I believe I've fixed it for all roformer models with this commit 😄 https://github.com/karaokenerds/python-audio-separator/commit/a581da750a61e5ab25f70cc026bd296df61944c8
I can confirm that this fixes both the bs and mel roformer models!
This needs to be re-opened. I tried running the latest version on a rather large file. My GPU usage went up to 100%, but it listed it as taking over 2 days to complete.
I then reverted to my code in https://github.com/karaokenerds/python-audio-separator/pull/74. Like the current master, my GPU usage went up to 100%, but this time the estimate was 12 minutes instead of 2 days.
My guess is that the current code does an unnecessary transfer to the CPU and then back to the GPU or something like that. It might be best to simply adopt my PR or figure out how it works.
Sorry to hear that :/ I'll try and look into it a bit more when I have some free time; you're probably right re. the cause but I'm not sure at the moment.
Just as a heads up though, the reason I didn't want to merge PR #74 as-is is that while it may support CPU and CUDA, it doesn't support MPS (which is my daily driver as my laptop is a macbook) - specifically this line:
device = 'cuda' if torch.cuda.is_available() else 'cpu'
It also disregards the previous (and battle-tested) logic in the main separator controller which detects the available inference device and configures them accordingly: https://github.com/karaokenerds/python-audio-separator/blob/main/audio_separator/separator/separator.py#L198
So if you decide you want to have another stab at making it work for you and also others more generally, ideally what we ought to do is pass through the configured device(s) to the BSRoformer class when it's instantiated in the load_model method in the MDXC class.
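To make the device-preference order being described here concrete (CUDA first, then Apple MPS, then CPU), here's a tiny illustrative helper. This is only a sketch, not the project's actual detection logic in separator.py, and the function name `pick_device` is hypothetical; in real code the flags would come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`:

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return an inference device string, preferring CUDA, then MPS, then CPU.

    The availability flags are passed in as parameters (rather than queried
    from torch directly) purely so the preference order is easy to see in
    isolation -- this is an illustration, not the project's real code.
    """
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"
```

The one-liner quoted above (`'cuda' if torch.cuda.is_available() else 'cpu'`) falls straight through to CPU on a Mac, which is why an MPS-only machine would lose GPU acceleration entirely.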
FYI, it seems to work well for me: https://github.com/young01ai/python-audio-separator/tree/fix-cuda
Nice one @young01ai 😄 open a PR for that and I'll test and merge!
PR: #84 , hope it's helpful. @beveradb
Thank you! I left a comment here, it's not working on my machine 🤔
This should now be fixed in audio-separator version 0.17.5 onwards - huge thanks to @young01ai for PR #84 🙇
This is still not fixed.
Try this to see if it works on Mac - https://github.com/karaokenerds/python-audio-separator/pull/74
It's working for me with the latest Roformer models using a modified version of the Dockerfile from this repo.
I commented elsewhere, but here is an issue since I believe this is a clear bug.
Running `model_bs_roformer_ep_317_sdr_12.9755` on UVR5 on a roughly 60 minute file called 03.wav takes about 10 minutes, and I look over to my task manager and it is clearly using my GPU. With python-audio-separator, however, my GPU is not being used (looking over to my task manager) and the same file takes 3-4 hours to process. python-audio-separator does use my GPU and is quite fast while using models like `UVR-MDX-NET-Inst_HQ_4.onnx` or `MDX23C-8KFFT-InstVoc_HQ_2.ckpt`. It is only the roformer models that ignore my GPU. Here is the code below; debug output from the command line version will be below that.
The output is as follows:
Command line output for `audio-separator -m 'model_bs_roformer_ep_317_sdr_12.9755.ckpt' --log_level debug -d .\03.wav`: