SHI-Labs / NATTEN

Neighborhood Attention Extension. Bringing attention to a neighborhood near you!
https://shi-labs.com/natten/

macos build failed. #5

Closed lucasjinreal closed 1 year ago

lucasjinreal commented 1 year ago
NATTEN/natten/src/cpp/natten1dqkrpb_cpu_kernel.cpp:59:26: error: expression is not assignable
                    updt[d1] += _qaddr[d1] * _kaddr[d1];
                    ~~~~~~~~ ^

with clang

alihassanijr commented 1 year ago

Hi,

Can you share your environment details, and the full error output? Also note you can install on mac with pip install natten directly now.

lucasjinreal commented 1 year ago

@alihassanijr Hi, I don't know why, but pip install natten just fails.

I am on M1 mac, with miniforge:

python
Python 3.9.13 | packaged by conda-forge | (main, May 27 2022, 17:00:33)
[Clang 13.0.1 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>

g++

g++ --version
Apple clang version 14.0.0 (clang-1400.0.29.102)
Target: arm64-apple-darwin21.6.0
Thread model: posix
alihassanijr commented 1 year ago

Could you share your pytorch version as well? And the full error if you can grab that?

You could just run this and share out.txt:

pip install natten 2>&1 | tee out.txt
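
A minimal sketch for gathering the details requested in this thread (Python build, CPU architecture, compiler, PyTorch version) in one run. The names `tool_version`, `torch_version`, and `collect_environment` are hypothetical helpers, not part of NATTEN or pip. It assumes `g++` is the compiler on the PATH and reports "not found" otherwise.

```python
import platform
import shutil
import subprocess
import sys


def tool_version(tool: str) -> str:
    """Return the first line of `<tool> --version`, or a note if unavailable."""
    path = shutil.which(tool)
    if path is None:
        return "not found"
    try:
        result = subprocess.run(
            [path, "--version"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.SubprocessError) as exc:
        return f"error: {exc}"
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else "no output"


def torch_version() -> str:
    """Return the installed PyTorch version, or a note if it cannot be imported."""
    try:
        import torch
    except Exception as exc:  # missing or broken install
        return f"not importable ({type(exc).__name__})"
    return str(torch.__version__)


def collect_environment() -> dict:
    """Collect the details a maintainer typically asks for in a build report."""
    return {
        "python": sys.version.replace("\n", " "),
        "machine": platform.machine(),
        "platform": platform.platform(),
        "g++": tool_version("g++"),
        "torch": torch_version(),
    }


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```

The output contains no file paths, so it can be shared even when the full build log cannot.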
lucasjinreal commented 1 year ago

@alihassanijr Hello, I am unable to attach the full log since it contains a lot of sensitive path information. Here is the main part:

cv/exp/NATTEN/natten/src/cpp/natten1dqkrpb_cpu_kernel.cpp:59:26: error: expression is not assignable
                    updt[d1] += _qaddr[d1] * _kaddr[d1];
                    ~~~~~~~~ ^
    /Users/xx/work/codes/cv/exp/NATTEN/natten/src/cpp/natten1dqkrpb_cpu_kernel.cpp:214:47: note: in instantiation of function template specialization 'natten::natten1dqkrpb_cpu_forward_kernel<3, 1, 10, float>' requested here
            LAUNCH_DNA_KNS(kernel_size, dilation, natten1dqkrpb_cpu_forward_kernel,
                                                  ^
    /Users/xx/work/codes/cv/exp/NATTEN/natten/src/cpp/natten1dqkrpb_cpu_kernel.cpp:59:26: error: expression is not assignable
                    updt[d1] += _qaddr[d1] * _kaddr[d1];

I suspect your code has some syntax that clang does not accept, or there is some sort of version mismatch. Can you please test it? (Avoid using recent syntax in C++.)

alihassanijr commented 1 year ago

I mean I'm unable to test your exact environment, but we've tested on both Intel and M1 macs and had no issues whatsoever. Can you share your torch version?

stevenwalton commented 1 year ago

Hi @jinfagang I'm testing on my M2 Air and I'm not having a problem.

$ python
Python 3.9.12 (main, Apr  5 2022, 01:53:17) 
[Clang 12.0.0 ] :: Anaconda, Inc. on darwin
$ g++ --version
Apple clang version 14.0.0 (clang-1400.0.29.202)
Target: arm64-apple-darwin22.1.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
$ sw_vers            
ProductName:        macOS
ProductVersion:     13.0.1
BuildVersion:       22A400
$ uname -a
Darwin redacted 22.1.0 Darwin Kernel Version 22.1.0: Sun Oct  9 20:15:52 PDT 2022; root:xnu-8792.41.9~2/RELEASE_ARM64_T8112 arm64

Can you show sw_vers? Which Mac are you on?

I followed these steps

$ pip install natten
$ python

(I suggest doing pip install natten --upgrade --no-cache-dir)

>>> import natten

Then I checked if things were built properly

$ git clone https://github.com/SHI-Labs/NATTEN
$ cd NATTEN
$ rm -fr natten # important so your environment doesn't get confused
$ python -m unittest discover -v -s ./tests

Can you write up your whole process so I can better understand (and try to reproduce) your error?

alihassanijr commented 1 year ago

FWIW, it successfully built and passed the tests on an Intel Mac and M1 Mac just now:

Intel

> g++ --v

Apple clang version 14.0.0 (clang-1400.0.29.102)
Target: x86_64-apple-darwin21.6.0
> sw_vers
ProductName:    macOS
ProductVersion: 12.6

M1

> g++ --v

Apple clang version 14.0.0 (clang-1400.0.29.102)
Target: arm64-apple-darwin21.6.0
> sw_vers
ProductName:    macOS
ProductVersion: 12.6
lucasjinreal commented 1 year ago

@alihassanijr @stevenwalton Thanks for the info. That's weird, I still get the error whether installing via pip or from source:

[image]

My g++ and macOS versions are the same. Exception

Python 3.9.13 | packaged by conda-forge | (main, May 27 2022, 17:00:33)
[Clang 13.0.1 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> torch.__version__
'1.12.1'
>>>

@stevenwalton Does the M2 have an official ARM-based conda available?

alihassanijr commented 1 year ago

I think I see what's going on. Based on those errors it's failing at AVX vectorizations, which should be disabled for ARM. Can you please try this and let me know if it succeeds?

git clone https://github.com/alihassanijr/NATTEN-Torch NATTEN
cd NATTEN
git checkout noavx

pip install -e .

This is just one commit ahead of the latest release, and it completely disables AVX, which I'm assuming should resolve your issue. If it does, we'll have to look into why your system in particular doesn't automatically skip over those definitions.

stevenwalton commented 1 year ago

@jinfagang I still need to know what system you are working on. We'll try to do our best to reproduce the issue but if we don't know what hardware you are on there is only so much we can do. As of now, the best advice I can give is try Ali's build. The only major difference I see is that your python is conda-forge packaged whereas mine is not. I don't use my M2 much for development so it is a pretty clean system.

lucasjinreal commented 1 year ago

@alihassanijr You are right, your code contains AVX, which cannot run on ARM. However, I don't know why @stevenwalton can run AVX on ARM.

Anyway, I just noticed that you might not be using native ARM Python; Rosetta, maybe? As far as I know, conda itself doesn't have an ARM-based Python; only miniforge does.

Installing collected packages: natten
  Running setup.py develop for natten
Successfully installed natten-0.14.4

You can update your commit so that it runs on real ARM.
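
The native-versus-Rosetta question raised above can be checked directly. The following is a rough sketch, assuming macOS exposes the `sysctl.proc_translated` key on Apple silicon (it reports `1` for a process translated by Rosetta 2). The helper names `read_proc_translated` and `describe_interpreter` are hypothetical.

```python
import platform
import subprocess
from typing import Optional


def read_proc_translated() -> Optional[str]:
    """Ask macOS whether this process is translated; None if the key is unavailable."""
    try:
        result = subprocess.run(
            ["sysctl", "-n", "sysctl.proc_translated"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        return None
    return result.stdout.strip() if result.returncode == 0 else None


def describe_interpreter(machine: str, translated: Optional[str]) -> str:
    """Classify the running Python from its reported architecture and Rosetta flag."""
    if translated == "1":
        return "x86_64 Python translated by Rosetta 2"
    if machine in ("arm64", "aarch64"):
        return "native ARM Python"
    if machine in ("x86_64", "AMD64"):
        return "x86_64 Python"
    return f"unrecognized architecture: {machine}"


if __name__ == "__main__":
    print(describe_interpreter(platform.machine(), read_proc_translated()))
```

Run it with the same interpreter that runs `pip`, since a machine can have both a native and a translated Python installed.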

stevenwalton commented 1 year ago
$ which python
/opt/anaconda3/bin/python

I'm using anaconda's python.

alihassanijr commented 1 year ago

@jinfagang It's actually more complicated than that. We don't use AVX directly, we use PyTorch's abstraction of vectorized operations on CPUs, which on compatible architectures translates into AVX (you can read more on exactly how that's done here). And the fact that it runs on ARM (both M1 and M2 chips) proves that very point. We've actually tried both anaconda and local python, and had no issues running NATTEN on either.

My only guess at this point is that there's something else we're missing from your end. It could be your PyTorch install that's creating the issue, or it could be your Python binaries. It would be helpful if you could share more about your environment, so that we could investigate the issue further, because even if we had an idea what the issue might be, which we don't at the moment, we'd still need to reproduce it to debug it.

And we cannot disable it for everyone, because not using vectorization considerably decreases throughput, and for no good reason for those who can already use it without a problem.
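
To make the idea of vectorization concrete, here is a conceptual pure-Python sketch. It is not NATTEN's kernel and does not use PyTorch's vectorized API. It only contrasts a scalar dot product with one that accumulates in fixed-width lanes and reduces at the end, which is the general shape of a lane-wise update such as `updt[d1] += _qaddr[d1] * _kaddr[d1]`. The names `dot_scalar` and `dot_lanes` and the lane width of 4 are assumptions made for illustration.

```python
from typing import List, Sequence


def dot_scalar(q: Sequence[float], k: Sequence[float]) -> float:
    """Reference dot product: one multiply-add per element."""
    total = 0.0
    for a, b in zip(q, k):
        total += a * b
    return total


def dot_lanes(q: Sequence[float], k: Sequence[float], width: int = 4) -> float:
    """Dot product accumulated in `width` independent lanes, then reduced."""
    if len(q) != len(k):
        raise ValueError("q and k must have the same length")
    lanes: List[float] = [0.0] * width
    full = len(q) - len(q) % width  # elements covered by whole chunks
    for start in range(0, full, width):
        # On real hardware this inner loop is a single SIMD multiply-add.
        for lane in range(width):
            lanes[lane] += q[start + lane] * k[start + lane]
    total = sum(lanes)  # horizontal reduction across lanes
    for i in range(full, len(q)):  # scalar tail for leftover elements
        total += q[i] * k[i]
    return total


if __name__ == "__main__":
    q = [float(i) for i in range(1, 11)]
    k = [2.0] * 10
    print(dot_scalar(q, k), dot_lanes(q, k))
```

Both functions compute the same sum; the lane version simply groups the work so that hardware with vector instructions can process several elements per instruction.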

lucasjinreal commented 1 year ago

Anaconda by default doesn't have an ARM Python? I think most users running on ARM are using miniforge, which you didn't test. That could be the root cause of this difference. I believe this would be a common issue for users running M1 with miniforge.

alihassanijr commented 1 year ago

Sorry I'm not sure I understand your last response.

stevenwalton commented 1 year ago

@jinfagang I want to make it clear that I am not using miniforge on my M2 Air; I've provided all the details below. I'm stressing this because forge is the community version, and that can come with certain issues. While it can be helpful for running bleeding-edge stuff (like having access to conda on ARM before it was wrapped into the official version), it is akin to running a beta release. I am not facing the same issue as you because conda has supported M1/M2 chips since May this year, so I never had to go down that route. For the same reason, I don't expect this to be a large problem moving forward.

I'll assign this topic to enhancements but our main focus is to provide software support on stable and main releases.

$ conda list anaconda
# packages in environment at /opt/anaconda3:
#
# Name                    Version                   Build  Channel
anaconda                  2022.05                  py39_0  
anaconda-client           1.9.0            py39hecd8cb5_0  
anaconda-navigator        2.1.4            py39hecd8cb5_0  
anaconda-project          0.10.2             pyhd3eb1b0_0  

$ conda --version
conda 4.13.0

$ conda info

     active environment : base
    active env location : /opt/anaconda3
            shell level : 1
       user config file : /Users/steven/.condarc
 populated config files : 
          conda version : 4.13.0
    conda-build version : 3.21.8
         python version : 3.9.12.final.0
       virtual packages : __osx=10.16=0
                          __unix=0=0
                          __archspec=1=x86_64
       base environment : /opt/anaconda3  (writable)
      conda av data dir : /opt/anaconda3/etc/conda
  conda av metadata url : None
           channel URLs : https://repo.anaconda.com/pkgs/main/osx-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/r/osx-64
                          https://repo.anaconda.com/pkgs/r/noarch
          package cache : /opt/anaconda3/pkgs
                          /Users/steven/.conda/pkgs
       envs directories : /opt/anaconda3/envs
                          /Users/steven/.conda/envs
               platform : osx-64
             user-agent : conda/4.13.0 requests/2.27.1 CPython/3.9.12 Darwin/22.1.0 OSX/10.16
                UID:GID : 503:20
             netrc file : None
           offline mode : False
alihassanijr commented 1 year ago

@jinfagang As far as your compile issue goes, upon looking into it further I noticed that the QK+RPB functions for CPU did incorrectly use the vectorized abstractions that PyTorch has provided since version 1.10.

I pushed a commit that should resolve that, as it completely removes the line you referenced. The difference between this and the other branch is that this one still maintains those vectorizations, therefore you don't lose the parallelization.

Can you please try this and let us know if it installs?

pip uninstall natten
pip install --no-cache-dir git+https://github.com/alihassanijr/NATTEN-Torch.git@78b8681a14bad3c3cb365b0ecfc18b6151c7d354
lucasjinreal commented 1 year ago

@alihassanijr I should have shared my torch version in the first place. I have torch 1.12, which is fairly recent, I think.

alihassanijr commented 1 year ago

@jinfagang Yeah you did share that earlier. The point I was making was that the vectorized version is only available for torch above 1.10, so we already assumed you were using that.

But anyway, the last commit I shared should fix your issue and not lose the parallelization that's expected. So it would be great if you could try it.

And if you're okay with that we'd reopen this issue until we merge that commit into master.
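
The version condition mentioned above can be checked numerically. This is a small sketch, assuming the cutoff is 1.10 inclusive (the thread says both "since version 1.10" and "above 1.10", so the exact boundary is an assumption). The helper names `parse_major_minor` and `supports_vectorized_path` are hypothetical and are not NATTEN's actual build logic.

```python
import re
from typing import Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings like '1.12.1' or '1.12.1+cpu'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def supports_vectorized_path(version: str, minimum: Tuple[int, int] = (1, 10)) -> bool:
    """True when the version is at or above the assumed 1.10 cutoff."""
    return parse_major_minor(version) >= minimum


if __name__ == "__main__":
    for candidate in ("1.9.0", "1.10.0", "1.12.1"):
        print(candidate, supports_vectorized_path(candidate))
```

Comparing integer tuples matters here: as plain strings, "1.9.0" sorts after "1.10.0", which would give the wrong answer.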

lucasjinreal commented 1 year ago

Sure. I ran into this problem because I want to try OneFormer, but it doesn't seem to support CPU...

calebhemara commented 1 year ago

Hey! I also haven't been able to pip install natten on M1 Pro.

Stuck on "Installing collected packages: natten Running setup.py develop for natten"

Tried:

git clone https://github.com/alihassanijr/NATTEN-Torch NATTEN
cd NATTEN
git checkout noavx
pip install -e .

and

pip uninstall natten
pip install --no-cache-dir git+https://github.com/alihassanijr/NATTEN-Torch.git@78b8681a14bad3c3cb365b0ecfc18b6151c7d354

python -c "import torch; print(torch.__version__)"
1.12.1

Any ideas? Thanks for your great work!

alihassanijr commented 1 year ago

Hi @calebhemara I think it's probably just compiling. Depending on your system and environment, it can take anywhere from a minute to 30 minutes. Unfortunately we still don't have the means to build binaries for Mac, so compilation takes place on your local device. Can you please let us know if it gets stuck on that step for longer than, say, 10 minutes?

Also keep in mind you should try the latter solution, and not the noavx one. So just do:

pip uninstall natten
pip install --no-cache-dir git+https://github.com/alihassanijr/NATTEN-Torch.git@78b8681a14bad3c3cb365b0ecfc18b6151c7d354
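
Since a long compile looks the same as a hang, one option is to wrap the install in a timer that prints a periodic heartbeat. This is a rough sketch using only the standard library. `run_timed` is a hypothetical helper, and the 60-second heartbeat is an arbitrary choice.

```python
import subprocess
import sys
import time
from typing import List, Tuple


def run_timed(cmd: List[str], heartbeat: float = 60.0) -> Tuple[int, float]:
    """Run `cmd`, reporting elapsed time every `heartbeat` seconds.

    Returns the exit code and the total wall-clock time in seconds.
    """
    start = time.monotonic()
    process = subprocess.Popen(cmd)
    while True:
        try:
            code = process.wait(timeout=heartbeat)
            break
        except subprocess.TimeoutExpired:
            elapsed = time.monotonic() - start
            print(f"still running after {elapsed / 60:.1f} min", flush=True)
    return code, time.monotonic() - start


# Example (not executed here): time a source build of the package.
# code, seconds = run_timed(
#     [sys.executable, "-m", "pip", "install", "--no-cache-dir", "natten"]
# )
```

Invoking pip as `sys.executable -m pip` keeps the build tied to the interpreter that is running the script.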
calebhemara commented 1 year ago

You were right. I suspected this might be the case but was surprised by how long it took: about 20 minutes. Thank you!