kuielab / mdx-net-submission

MIT License
119 stars 23 forks source link

Unable to install the repo and potential limitation for windows users #1

Open ghost opened 2 years ago

ghost commented 2 years ago

Thank you very much for you work and congratulation for your position in the ISMIR 2021 challenge.

1) I'd like to clone the repo but unfortunatly the GIT LFS storage seem to be down.

Cloning into 'mdx-net'...
remote: Enumerating objects: 262, done.
remote: Counting objects: 100% (262/262), done.
remote: Compressing objects: 100% (139/139), done.R
Receiving objects:  86% (226/262)used 228 (delta 107), pack-reused 0 eceiving objects:  85% (223/262)
Receiving objects: 100% (262/262), 5.10 MiB | 13.56 MiB/s, done.
Resolving deltas: 100% (124/124), done.
Updating files: 100% (35/35), done.
Downloading data/test/Mu - Too Bright/bass.wav (1.2 MB)
Error downloading object: data/test/Mu - Too Bright/bass.wav (4374ea2): Smudge error: Error downloading data/test/Mu - Too Bright/bass.wav (4374ea242e92d75ddd8d675a3774a321fc09520511a19c76de7b7d40a0070453): 
batch response: This repository is over its data quota. Account responsible for LFS bandwidth should purchase more data packs to restore access.

2) It seem that you use signal.SIGALRM with the time_limit(seconds) function in the music_demixing.py source code. But this method do not work with the Windows OS. See the question on Stackoverflow

In all case, thank you very much for all your great work.

Ma5onic commented 2 years ago

I'm also having this issue

Error 1

(mdx-submit) F:\mdx-net-submission>python test.py
Traceback (most recent call last):
  File "F:\mdx-net-submission\evaluator\music_demixing.py", line 145, in run
    self.evaluation()
  File "F:\mdx-net-submission\evaluator\music_demixing.py", line 120, in evaluation
    with time_limit(self.inference_setup_timeout):
  File "C:\Users\MDX-Debugger\.conda\envs\mdx-submit\lib\contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "F:\mdx-net-submission\evaluator\music_demixing.py", line 27, in time_limit
    signal.signal(signal.SIGALRM, signal_handler)
AttributeError: module 'signal' has no attribute 'SIGALRM'

Traceback (most recent call last):
  File "F:\mdx-net-submission\test.py", line 64, in <module>
    submission.run()
  File "F:\mdx-net-submission\evaluator\music_demixing.py", line 151, in run
    raise e
  File "F:\mdx-net-submission\evaluator\music_demixing.py", line 145, in run
    self.evaluation()
  File "F:\mdx-net-submission\evaluator\music_demixing.py", line 120, in evaluation
    with time_limit(self.inference_setup_timeout):
  File "C:\Users\MDX-Debugger\.conda\envs\mdx-submit\lib\contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "F:\mdx-net-submission\evaluator\music_demixing.py", line 27, in time_limit
    signal.signal(signal.SIGALRM, signal_handler)
AttributeError: module 'signal' has no attribute 'SIGALRM'

Error 2

(mdx-submit) F:\mdx-net-submission>python predict_blend.py
Traceback (most recent call last):
  File "F:\mdx-net-submission\evaluator\music_demixing.py", line 145, in run
    self.evaluation()
  File "F:\mdx-net-submission\evaluator\music_demixing.py", line 120, in evaluation
    with time_limit(self.inference_setup_timeout):
  File "C:\Users\MDX-Debugger\.conda\envs\mdx-submit\lib\contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "F:\mdx-net-submission\evaluator\music_demixing.py", line 27, in time_limit
    signal.signal(signal.SIGALRM, signal_handler)
AttributeError: module 'signal' has no attribute 'SIGALRM'

Traceback (most recent call last):
  File "F:\mdx-net-submission\predict_blend.py", line 84, in <module>
    submission.run()
  File "F:\mdx-net-submission\evaluator\music_demixing.py", line 151, in run
    raise e
  File "F:\mdx-net-submission\evaluator\music_demixing.py", line 145, in run
    self.evaluation()
  File "F:\mdx-net-submission\evaluator\music_demixing.py", line 120, in evaluation
    with time_limit(self.inference_setup_timeout):
  File "C:\Users\MDX-Debugger\.conda\envs\mdx-submit\lib\contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "F:\mdx-net-submission\evaluator\music_demixing.py", line 27, in time_limit
    signal.signal(signal.SIGALRM, signal_handler)
AttributeError: module 'signal' has no attribute 'SIGALRM'
ws-choi commented 2 years ago

Hi all,

It seems too many users have downloaded via git-lfs. It is great for us, but the capacity ran out too quickly. I am sorry about this issue, and I will find an alternative way to provide checkpoints more stably.

About windows error, we encountered the same issue, and tbh we have not solved that issue. I think it might be caused by the code from AIcrowds, not our original code, because we met this error when we ran the AICrowd's code base without any changes right after forking the original repo. If you run this code in a Linux-based system then it could work, hopefully.

Anyway I will look into it sooner or later.

Currently I am working on other projects, but I will address these issues asap. Thank you for your attention.

Zokhoi commented 2 years ago

Per this comment, the leaderboard_A branch now supports Windows, with installation instructions in the README.

Ma5onic commented 2 years ago

@ws-choi

Hi all,

It seems too many users have downloaded via git-lfs. It is great for us, but the capacity ran out too quickly. I am sorry about this issue, and I will find an alternative way to provide checkpoints more stably.

About windows error, we encountered the same issue, and tbh we have not solved that issue. I think it might be caused by the code from AIcrowds, not our original code, because we met this error when we ran the AICrowd's code base without any changes right after forking the original repo. If you run this code in a Linux-based system then it could work, hopefully.

Anyway I will look into it sooner or later.

Currently I am working on other projects, but I will address these issues asap. Thank you for your attention.

I have previously been able to get it working and still have the repo that I used at that time. I'll get back to you if I spot any interesting/breaking differences.

Ma5onic commented 2 years ago

@ws-choi @Zokhoi

UPDATE! Found a workaround for Windows users!

Sorry it took so long, I forgot that I offered to troubleshoot this... I finally got around to setting up a bare metal windows install for troubleshooting this issue and sifted through my abyss of old hard drives to find the code that previously worked for me before I switched to linux.

Findings & Observations:

Both the test.py and predict_blend.py scripts errored-out starting at line 27 of music_demixing.py when using the signal library. In the documentation for that library, we can find this statement:

On Windows, signal() can only be called with SIGABRT, SIGFPE, SIGILL, SIGINT, SIGSEGV, SIGTERM, or SIGBREAK. A ValueError will be raised in any other case. Note that not all systems define the same set of signal names; an AttributeError will be raised if a signal name is not defined as SIG* module level constant.

My first work-around (not a fix) wasn't very pretty and only consisted of commenting out most of the signal_handler function and ending it with yeild so that is essetially did nothing. That way python wont yell at me for not having definded it.

Now that I took a closer look, I decided to make a better work-around (not a fix) for this ValueError that can be merged with the project without removing the time_limit() functionality for linux/mac users. This can can be done by editing music_demixing.py and importing platform from sys and adding an "if" statement that will only execute time_limit() if the user is not running a Windows operating system:

To this:

def time_limit(seconds):
    if platform != "win32":
        def signal_handler(signum, frame):
            raise TimeoutException("Prediction timed out!")

            signal.signal(signal.SIGALRM, signal_handler)
            signal.alarm(seconds)
            try:
                yield
            finally:
                signal.alarm(0)
    else:
        yield

Note to about code above noobs like me:

The yield statement suspends a function’s execution and sends a value back to the caller, but retains enough state to enable the function to resume where it left off. When the function resumes, it continues execution immediately after the last yield run. This allows its code to produce a series of values over time, rather than computing them at once and sending them back like a list.[^1][^2] This will stop the error from occurring, but the test will not time out. Thus why it is only a work-around and not a permanent if

I found this workaround thanks to a closed issue in an unrelated github repo that used the signal library in a similar method as this repo: https://github.com/jeffbass/imagenode/issues/7#issue-673578509

Possible Long-Term Fixes:

To properly FIX this issue, the code in music_demixing.py should be changed to not use SIGALRM.[^3] One idea could be to use the built-in "timeit" library for deciding when to raise a timeout timeout error.

To look for hints on other ways how to properly fix this, I looked at the replies from the owner of the unrelated repo that I just mentioned:

"Thanks for this bug report. I want to support Windows users and will need to remove references to the unix / Linux SIGALRM from the receive_test.py program but also from the imagenode.py call to the Patience class. The SIGALRM unix OS signal is a simple timer, but it runs asynchronously outside of Python. I will need to replace it with another algorithm."

He eventually fixed this by replacing signal.SIGALRM call with a Python thread and a sleep timer as seen here: https://github.com/jeffbass/imagenode/commit/0bc4cd0ec1039d9cde36443938d819d551e4b4aa https://github.com/jeffbass/imagenode/commit/90adf4afc06d4d7bd91a89e70736fda8d5eb2e29

Footnotes/References:

[^1]: - When to use yield instead of return in Python? https://www.geeksforgeeks.org/use-yield-keyword-instead-return-keyword-python/ [^2]: - In Python, what is the difference between pass and return? https://stackoverflow.com/questions/7872611/in-python-what-is-the-difference-between-pass-and-return [^3]: - Fix Python Signal AttributeError: module 'signal' has no attribute 'SIGALRM' – Python Tutorial https://www.tutorialexample.com/fix-python-signal-attributeerror-module-signal-has-no-attribute-sigalrm-python-tutorial/

Ma5onic commented 2 years ago

Here is the output after a fresh miniconda install, of the successful runs from windows anaconda prompt:

(mdx-submit) F:\mdx-net-submission>python test.py
F:\mdx-net-submission/data/test/Mu - Too Bright\mixture.wav
Mixture file is present at following location: F:\mdx-net-submission/data/test/Mu - Too Bright\mixture.wav
F:\mdx-net-submission/data/test/Mu - Too Bright\mixture.wav: prediction completed.
Successfully generated predictions!
(mdx-submit) F:\mdx-net-submission>python predict_blend.py
F:\mdx-net-submission/data/test/Mu - Too Bright\mixture.wav
2.300908327102661
F:\mdx-net-submission\predict_blend.py:50: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at  C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\utils\tensor_new.cpp:204.)
  mix_waves = torch.tensor(mix_waves, dtype=torch.float32)
C:\Users\MDX-Debugger\.conda\envs\mdx-submit\lib\site-packages\torch\functional.py:606: UserWarning: stft will soon require the return_complex parameter be given for real inputs, and will further require that return_complex=True in a future PyTorch release. (Triggered internally at  C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\SpectralOps.cpp:803.)
  return _VF.stft(input, n_fft, hop_length, win_length, window,  # type: ignore[attr-defined]
F:\mdx-net-submission\models.py:154: UserWarning: istft will require a complex-valued input tensor in a future PyTorch release. Matching the output from stft with return_complex=True.  (Triggered internally at  C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\SpectralOps.cpp:979.)
  x = torch.istft(x, n_fft=self.n_fft, hop_length=self.hop, window=self.window, center=True)
7.308706283569336
Successfully completed music demixing...

Please be aware, I noticed that both scripts are messing with the type of slash used to represent a directory. This didn't cause any issues for me when running the current scripts, but might cause issues developers using windows. Using powershell instead of cmd.exe could serve as a workaround for that because PS understands both and linux directory syntax.

PS: The code above is highlighted as powershell syntax.

Ma5onic commented 2 years ago

Until the Pull Request is merged, windows users can manually edit their files like this: https://github.com/kuielab/mdx-net-submission/pull/19/commits/5f1eb7a3abc56ea120d17ebdc545013c63616f4a