rotemtzaban / STIT

MIT License

Click problems? #25

Open NoUserNameForYou opened 2 years ago

NoUserNameForYou commented 2 years ago

While trying to run the first inversion step I get this:

"Traceback (most recent call last):
  File "\STIT\train.py", line 124, in <module>
    main()
  File "\python\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "\python\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "\python\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "\python\lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "\STIT\train.py", line 62, in main
    _main(**config, config=config)
  File "\STIT\train.py", line 76, in _main
    files = make_dataset(input_folder)
  File "\STIT\utils\data_utils.py", line 27, in make_dataset
    assert os.path.isdir(dir), '%s is not a valid directory' % dir
AssertionError: /data/obama is not a valid directory"

(removed the full path names due to privacy reasons while I pasted the error log here)

I got this error while trying to install requirements.txt:

qt5-tools 5.15.2.1.2 requires click~=7.0, but you have click 8.1.2 which is incompatible.

I'm on Windows 11 with Python 3.9.9 and CUDA 11.6.

If you could release an easier way of doing this, e.g. an Anaconda environment file to build from, I'd be happy.

rotemtzaban commented 2 years ago

@NoUserNameForYou

Hi,

Did you install the requirements in a clean environment? On my Linux machine it does not install qt5-tools. Was that package installed via our requirements.txt? Unfortunately I don't currently have a Windows machine, so I can only test on Linux.
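For reference, a clean environment can be created with Python's standard `venv` module. This is only a minimal sketch: the helper name `create_clean_env` and the directory name `stit-env` are made up for illustration and are not part of this repository.

```python
import venv
from pathlib import Path


def create_clean_env(env_dir, with_pip=True):
    """Create a fresh virtual environment at env_dir and return its path.

    Hypothetical helper for illustration; not part of STIT.
    """
    env_path = Path(env_dir)
    # clear=True empties an existing directory first, so packages left over
    # from earlier installs (such as qt5-tools) cannot leak into the new env.
    venv.create(env_path, with_pip=with_pip, clear=True)
    return env_path


# Example (directory name is an assumption):
#   create_clean_env("stit-env")
# Then activate the environment and run: pip install -r requirements.txt
```

Installing requirements.txt into an environment created this way should show whether qt5-tools comes from the requirements or from something installed earlier.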

NoUserNameForYou commented 2 years ago

I had a few things installed before. But shouldn't the latest click work?

rotemtzaban commented 2 years ago

It does. It seems like the qt5-tools package depends on an older version of click, which prevents the latest from being installed.
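To illustrate why pip complains: `click~=7.0` is a "compatible release" specifier, which means "at least 7.0, and still within the 7.x series". The sketch below is a simplified check that only handles plain numeric versions such as 8.1.2; the function names are made up for illustration, and real tools handle many more cases (pre-releases, epochs, and so on).

```python
def parse_release(version):
    """Turn '8.1.2' into (8, 1, 2). Numeric release segments only."""
    return tuple(int(part) for part in version.split("."))


def satisfies_compatible_release(installed, spec):
    """Return True if `installed` satisfies `~=spec` (simplified check)."""
    spec_parts = parse_release(spec)
    if len(spec_parts) < 2:
        raise ValueError("~= needs at least two release segments")
    installed_parts = parse_release(installed)
    # Pad with zeros so that '7' compares equal to '7.0'.
    missing = len(spec_parts) - len(installed_parts)
    if missing > 0:
        installed_parts += (0,) * missing
    # ~=7.0 means: >= 7.0 and matching the prefix 7.*
    prefix = spec_parts[:-1]
    return installed_parts >= spec_parts and installed_parts[:len(prefix)] == prefix


print(satisfies_compatible_release("8.1.2", "7.0"))  # False
print(satisfies_compatible_release("7.1.2", "7.0"))  # True
```

So click 8.1.2 falls outside what qt5-tools declares, which is what the pip message reports.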

NoUserNameForYou commented 2 years ago

"It does. It seems like the qt5-tools package depends on an older version of click, which prevents the latest from being installed."

Still the same error after "pip install --upgrade --no-deps --force-reinstall click" and even after uninstalling qt5 tools.

I ran "python train.py --input_folder /data/obama --output_folder training_results --run_name obama --num_pti_steps 80" in cmd, by the way. I don't know how to use the .py files, as they have \ at the end of each line and they fail at the first line.

I give up until it becomes user-friendly, maybe with a virtual env. Thanks for trying to help, at least.

rotemtzaban commented 2 years ago

@NoUserNameForYou Looking again at the error, it doesn't seem to come from click at all. It looks like there is an issue with the input folder you passed. One possibility is that you used Unix-style slashes (e.g. /) instead of Windows-style backslashes (e.g. \) as the separator. Also, I assume you've downloaded the sample videos and that they exist in the directory you give there.

NoUserNameForYou commented 2 years ago

Thanks, that was it, though not exactly that: using "--input_folder \data\obama" didn't work either. The problem was the leading backslash. Using "--input_folder data\obama" works now.
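A small sketch of why the leading separator matters, using `pathlib.PureWindowsPath` so it runs on any OS. The working directory `C:\work\STIT` is a made-up stand-in, since the real path was removed from the logs. On Windows, a path that starts with / or \ is anchored at the root of the current drive instead of the current folder.

```python
from pathlib import PureWindowsPath

# Hypothetical working directory; the real one was redacted in the logs.
CWD = PureWindowsPath(r"C:\work\STIT")


def resolve_against_cwd(arg, cwd=CWD):
    """Show where a --input_folder argument points when run from cwd."""
    # Joining keeps the drive of cwd, but a leading / or \ in arg
    # replaces everything after the drive.
    return str(cwd / arg)


for arg in ("/data/obama", r"\data\obama", r"data\obama"):
    print(arg, "->", resolve_against_cwd(arg))
```

The first two arguments both resolve to a `data\obama` folder directly under the drive root, and only the last one resolves to a folder inside the working directory.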

I got an out-of-memory error, so I'll look into the max_split_size_mb option now. Thank you.

edit:

"Traceback (most recent call last):
  File "STIT\torch_utils\ops\bias_act.py", line 48, in _init
    _plugin = custom_ops.get_plugin('bias_act_plugin', sources=sources, extra_cuda_cflags=['--use_fast_math'])
  File "STIT\torch_utils\custom_ops.py", line 64, in get_plugin
    raise RuntimeError(f'Could not find MSVC/GCC/CLANG installation on this computer. Check _find_compiler_bindir() in "{__file__}".')
RuntimeError: Could not find MSVC/GCC/CLANG installation on this computer. Check _find_compiler_bindir() in "STIT\torch_utils\custom_ops.py"."

Since you're not pulling it from PATH I'll try to manually enter my VS19 custom folder location.
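A quick way to see whether a compiler is visible on PATH at all, independent of how custom_ops.py searches for it, is `shutil.which`. This is a sketch only: the function name and the list of candidate names are assumptions for illustration.

```python
import shutil


def find_compiler_on_path(candidates=("cl", "gcc", "clang")):
    """Return the full path of the first candidate found on PATH, else None.

    Hypothetical helper; the candidate names are assumptions.
    """
    for name in candidates:
        # On Windows, shutil.which also tries the PATHEXT extensions,
        # so "cl" matches cl.exe.
        location = shutil.which(name)
        if location is not None:
            return location
    return None


print(find_compiler_on_path())
```

If this prints None in the same cmd window used for train.py, the Visual Studio compiler folder is not on PATH for that session.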

Edit 2: I give up. "torch_utils\ops\upfirdn2d.py:34: UserWarning: Failed to build CUDA kernels for upfirdn2d. Falling back to slow reference implementation. Details:"

Providing a virtual environment shouldn't be hard, man. See how this guy did it for GPEN: https://github.com/Cioscos/GPEN-Cioscos

Edit 3: I provided the correct cl.exe location in the custom ops file and now it's compiling. Awaiting the next error.

Aaaand edit 4: As expected, here's the next error:

"Traceback (most recent call last):
  File "\STIT\train.py", line 124, in <module>
    main()
  File "\python\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "\python\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "\python\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "\python\lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "\STIT\train.py", line 62, in main
    _main(**config, config=config)
  File "\STIT\train.py", line 91, in _main
    ws = coach.train()
  File "\STIT\training\coaches\coach.py", line 142, in train
    generated_images = self.forward(w_pivot)
  File "\STIT\training\coaches\coach.py", line 93, in forward
    generated_images = self.G.synthesis(w, noise_mode='const', force_fp32=True)
  File "\python\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "<string>", line 463, in forward
  File "\python\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "<string>", line 397, in forward
  File "\python\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "<string>", line 296, in forward
  File "\STIT\torch_utils\ops\bias_act.py", line 88, in bias_act
    return _bias_act_cuda(dim=dim, act=act, alpha=alpha, gain=gain, clamp=clamp).apply(x, b)
  File "\STIT\torch_utils\ops\bias_act.py", line 153, in forward
    y = _plugin.bias_act(x, b, _null_tensor, _null_tensor, _null_tensor, 0, dim, spec.cuda_idx, alpha, gain, clamp)
RuntimeError: CUDA out of memory. Tried to allocate 128.00 MiB (GPU 0; 4.00 GiB total capacity; 2.58 GiB already allocated; 0 bytes free; 2.77 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF"
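For anyone landing here with the same message, this is roughly how the max_split_size_mb hint is applied. It is a sketch under assumptions: the value 128 is an arbitrary illustration, not a recommendation from this repository, and the numbers are copied from the error above. Setting the variable before torch is imported is the safe choice.

```python
import os

# Figures copied from the error message above, in GiB.
total_gib, allocated_gib, reserved_gib = 4.00, 2.58, 2.77

# Memory PyTorch has reserved but is not currently using. The hint in the
# error only applies when this gap is large compared to what is allocated.
slack_gib = reserved_gib - allocated_gib
print(f"reserved but unallocated: {slack_gib:.2f} GiB of {total_gib:.2f} GiB")

# Allocator setting named in the error message. 128 is an assumed example
# value; setdefault leaves any value the user has already exported untouched.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:128")

# import torch  # import only after the variable is set
```

Here the gap is only about 0.19 GiB, so fragmentation does not look like the main problem, and the setting may not be enough on its own for a 4 GiB card.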

And I'm out.