NASA-IMPACT / hls-foundation-os

This repository contains examples of fine-tuning the Harmonized Landsat and Sentinel-2 (HLS) Prithvi foundation model.
Apache License 2.0

Fix the SETUP tutorial please #24

Closed anumerico closed 9 months ago

anumerico commented 12 months ago

We've been trying to make this work for 24 hours now, following your instructions in the repo. It seems hopeless. It would be nice to have good resources for the NASA Space Apps.

Command: mim train mmsegmentation /content/mmsegmentation/hls-foundation-os/configs/multi_temporal_crop_classification.py

Error:
from mmcv.cnn.utils import revert_sync_batchnorm
ImportError: cannot import name 'revert_sync_batchnorm' from 'mmcv.cnn.utils'

CarlosGomes98 commented 12 months ago

Hi @anumerico. Can you please confirm which version of mmcv you have installed? Version 1.6.2 should have revert_sync_batchnorm in mmcv.cnn.utils.
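One way to answer the version question is to ask Python directly. The following is a minimal sketch, not something from the repo: the helper names `installed_version` and `has_attr` are made up for illustration. It reports which of the `mmcv` and `mmcv-full` distributions is installed and whether `revert_sync_batchnorm` can be found in `mmcv.cnn.utils`.

```python
import importlib
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def has_attr(module_name, attr):
    """Return True/False if the module imports and has/lacks `attr`.

    Returns None when the module itself cannot be imported.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return hasattr(module, attr)


if __name__ == "__main__":
    # Both distributions provide the `mmcv` import name, so check each one.
    for dist in ("mmcv", "mmcv-full"):
        print(dist, "->", installed_version(dist))
    print("revert_sync_batchnorm found:",
          has_attr("mmcv.cnn.utils", "revert_sync_batchnorm"))
```

A result of `None` for the last line means `mmcv.cnn.utils` could not be imported at all, which points at the install rather than at the version.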

anumerico commented 12 months ago

Ok, some progress with that. Now getting a new error:

Fail to import MultiScaleDeformableAttention from mmcv.ops.multi_scale_deform_attn, You should install mmcv-full if you need this module.
warnings.warn('Fail to import MultiScaleDeformableAttention from '

return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named 'mmcv._ext'

anumerico commented 12 months ago

mmcv-full fixes it, and now I'm getting:

UserWarning: --gpus is deprecated because we only support single GPU mode in non-distributed training. Use gpus=1 now.
warnings.warn('--gpus is deprecated because we only support '

CarlosGomes98 commented 12 months ago

Could you confirm that you've installed mmcv-full, and not mmcv? The error message seems to suggest that you do not have mmcv-full.

Note that the install command in the README for mmcv is: mim install mmcv-full==1.6.2 -f https://download.openmmlab.com/mmcv/dist/{cuda_version}/{torch_version}/index.html
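The `{cuda_version}` and `{torch_version}` placeholders have to be filled in by hand. As a rough sketch, assuming the `cu115` / `torch1.11.0` token format that appears in the commands later in this thread, the URL can be built like this (the function name `mmcv_find_links_url` is hypothetical):

```python
MMCV_INDEX_TEMPLATE = (
    "https://download.openmmlab.com/mmcv/dist/"
    "{cuda_version}/{torch_version}/index.html"
)


def mmcv_find_links_url(cuda, torch):
    """Fill the README template from plain version strings.

    Assumes the token format seen in this thread: CUDA "11.5" becomes
    "cu115" and torch "1.11.0" becomes "torch1.11.0".
    """
    cuda_version = "cu" + cuda.replace(".", "")
    torch_version = "torch" + torch
    return MMCV_INDEX_TEMPLATE.format(
        cuda_version=cuda_version, torch_version=torch_version
    )


# Print the full install command for CUDA 11.5 and torch 1.11.0.
print("mim install mmcv-full==1.6.2 -f " + mmcv_find_links_url("11.5", "1.11.0"))
```

Building the URL in code rather than typing it makes it harder to drop a character from the path by accident.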

CarlosGomes98 commented 12 months ago

Quoting @anumerico: "mmcv-full fixes it and now getting: UserWarning: --gpus is deprecated because we only support single GPU mode in non-distributed training. Use gpus=1 now. warnings.warn('--gpus is deprecated because we only support '"

This is just a warning and should be OK :) If you are training on a single GPU, you may drop the --launcher pytorch part of the command, and the warning should disappear as a result.

anumerico commented 12 months ago

No, I actually get that warning after already dropping --launcher pytorch. If I keep it, I get another error:

TypeError: FormatCode() got an unexpected keyword argument 'verify'
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 39863) of binary: /usr/local/bin/python3.10

And in fact, when I drop the launcher I get more than the warning:

text, _ = FormatCode(text, style_config=yapf_style, verify=True)
TypeError: FormatCode() got an unexpected keyword argument 'verify'

Do you think it's the Python 3.9 requirement? Are you taking part in the NASA Space Apps? Got a Discord username?

I'm trying to follow the setup exactly, with its deps and all, like I did at the start of the day. Can you confirm that the setup works?

CarlosGomes98 commented 12 months ago

This is probably from that Python 3.9 prerequisite, yes. You can try pip install yapf==0.40.1, which may fix this (it might require uninstalling yapf first).
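The traceback shows mmcv calling `FormatCode(..., verify=True)`, so the question is whether the installed yapf still accepts that keyword. A minimal sketch for checking this without running any training is below; `accepts_keyword` is a hypothetical helper, and the import path `yapf.yapflib.yapf_api` is the module that exposes `FormatCode`.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    # A **kwargs parameter accepts any keyword.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return True
    param = params.get(name)
    return param is not None and param.kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )


if __name__ == "__main__":
    try:
        from yapf.yapflib.yapf_api import FormatCode
    except ImportError:
        print("yapf is not installed in this environment")
    else:
        ok = accepts_keyword(FormatCode, "verify")
        print("FormatCode accepts verify=...:", ok)
        if not ok:
            print("Consider the downgrade suggested above: pip install yapf==0.40.1")
```

If this prints `False`, the installed yapf is incompatible with the mmcv call shown in the traceback, and the pinned version from this thread is worth trying.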

anumerico commented 12 months ago

I was cutting corners to make it work in Colab, that might be it. Do you know how to run this command in Colab? !conda activate environment-name

anumerico commented 12 months ago

Found a way, hopefully this time I can make it work. Thanks for your support. Still issues though:

Command: pip install torch==1.11.0+cu115 torchvision==0.12.0+cu115 --extra-index-url https://download.pytorch.org/whl/cu115

Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cu115
ERROR: Could not find a version that satisfies the requirement torch==1.11.0+cu115 (from versions: 1.13.0, 1.13.1, 2.0.0, 2.0.1, 2.1.0)
ERROR: No matching distribution found for torch==1.11.0+cu115
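Whether the interpreter version explains this particular pip error is not confirmed in the thread. Since the setup assumes Python 3.9 and an earlier traceback shows /usr/local/bin/python3.10, it may still be worth checking which interpreter pip is running under before installing pinned wheels. A small sketch, with the hypothetical helper `python_matches`:

```python
import sys

# Taken from the "python 3.9" prerequisite mentioned earlier in the thread.
REQUIRED_PYTHON = (3, 9)


def python_matches(version_info, required=REQUIRED_PYTHON):
    """Compare only the major and minor version numbers."""
    return tuple(version_info[:2]) == tuple(required)


if __name__ == "__main__":
    print("Running Python %d.%d" % sys.version_info[:2])
    if not python_matches(sys.version_info):
        print("This differs from the %d.%d the setup assumes; "
              "pinned wheels may not be available." % REQUIRED_PYTHON)
```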

anumerico commented 11 months ago

Managed to sort out the former, now I'm here:

COMMAND: mim install mmcv-full==1.6.2 -f https://download.openmmlab.com/mmcv/dist/cu115/torch1.11.0/index.htm

ERROR:
Building wheels for collected packages: mmcv-full
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
Building wheel for mmcv-full (setup.py) ... error
ERROR: Failed building wheel for mmcv-full
Running setup.py clean for mmcv-full
Failed to build mmcv-full
ERROR: Could not build wheels for mmcv-full, which is required to install pyproject.toml-based projects

anumerico commented 11 months ago

Still here. I have tried again to really stick to the instructions, and it's still not working. It's ridiculous that just the setup has all these issues; very annoying. I tried installing locally and followed the instructions exactly, with a virtual environment, and I'm still getting errors:

RuntimeError: TemporalEncoderDecoder: TemporalViTEncoder: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
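The error message itself names the remedy: pass `map_location` to `torch.load` on a machine without CUDA. The sketch below shows that pattern in isolation. It is an illustration only: `pick_map_location` and `load_checkpoint` are hypothetical names, and in this repo the checkpoint is loaded inside the model code, so the actual call site would have to be located and adjusted.

```python
def pick_map_location(cuda_available):
    """Choose where checkpoint tensors should be mapped when loading."""
    return "cuda" if cuda_available else "cpu"


def load_checkpoint(path):
    """Load a checkpoint so that it also works on a CPU-only machine."""
    import torch  # imported lazily so the helper above works without torch

    location = pick_map_location(torch.cuda.is_available())
    # map_location remaps storages saved on a CUDA device to `location`.
    return torch.load(path, map_location=location)


if __name__ == "__main__":
    # On a CPU-only machine this prints "cpu".
    print(pick_map_location(False))
```

If `torch.cuda.is_available()` is unexpectedly False on a machine that does have a GPU, the installed torch build or the driver is the more likely culprit, and remapping to CPU only hides that.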

I would like proper SETUP instructions for local and colab.

agraham9966 commented 11 months ago

I was so near defeat, but alas I have prevailed on top! I am now able to run model_inference.py. Haven't tried training etc. so not sure if I will run into other issues. But this is how I got it working on my local machine (Windows 10). Hope this helps others who are also having issues with setup:

conda create -n hlsdl python=3.9
conda activate hlsdl

pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113

git clone https://github.com/NASA-IMPACT/hls-foundation-os.git D:\hls-foundation-os

# cloning mmsegmentation to local dir (short directory path) 
git clone https://github.com/open-mmlab/mmsegmentation.git D:\mmsegmentation 

cd D:\mmsegmentation 

git checkout 186572a3ce64ac9b6b37e66d58c76515000c3280 # commit used for hls-foundation 

cd D:\hls-foundation-os

At this point I had to modify setup.py. Running 'pip install -e .' was throwing errors because it tried to install mmsegmentation in a temp directory whose path exceeded the allowable character limit in Windows. I suspect others will hit this as well, since I saw the same issue on two machines. The workaround was to modify setup.py so it installs from the cloned mmsegmentation:

from setuptools import setup

setup(
    name='geospatial_fm',
    version='0.1.0',
    description='MMSegmentation classes for geospatial-fm finetuning',
    author='Paolo Fraccaro, Carlos Gomes, Johannes Jakubik',
    packages=['geospatial_fm'],
    license="Apache 2",
    long_description=open("README.md").read(),
    install_requires=[
        # Install mmsegmentation from the local clone (short path) instead of a temp directory
        "mmsegmentation @ file:///D:/mmsegmentation",
        "rasterio",
        "tifffile",
        "einops",
        "timm==0.4.12",
        "tensorboard",
        "imagecodecs",
    ],
)
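The `mmsegmentation @ file:///D:/mmsegmentation` entry is a direct reference to a local directory. For a clone in a different location, the `file://` URI can be generated rather than typed by hand. A minimal sketch follows; `local_requirement` is a hypothetical helper and the POSIX path is just an example location.

```python
from pathlib import PurePosixPath, PureWindowsPath


def local_requirement(package, clone_path):
    """Build a 'name @ file://...' requirement for an absolute local path."""
    return f"{package} @ {clone_path.as_uri()}"


# Windows clone, as in the setup.py above.
print(local_requirement("mmsegmentation", PureWindowsPath(r"D:\mmsegmentation")))
# prints: mmsegmentation @ file:///D:/mmsegmentation

# A Linux clone at an example location.
print(local_requirement("mmsegmentation", PurePosixPath("/home/user/mmsegmentation")))
# prints: mmsegmentation @ file:///home/user/mmsegmentation
```

`as_uri()` only works on absolute paths, which is also what the requirement string needs.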

Then continue with setup:

pip install -e .     

pip install -U openmim

mim install mmcv-full==1.6.2 -f https://download.openmmlab.com/mmcv/dist/cu115/torch1.11.0/index.html

conda install -c conda-forge opencv

pip install datasets 

Doing this got it to work on my machine. What a hack job but works. Hope all the pain and suffering I went through at least helps someone.
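After following steps like these, it can help to confirm that the environment really ended up with the pinned versions. The sketch below compares installed versions against the pins quoted in this thread. The `EXPECTED` table and helper names are assumptions made for illustration, and local build tags such as `+cu113` are ignored in the comparison.

```python
from importlib import metadata

# Pins taken from the commands and setup.py quoted in this thread.
EXPECTED = {
    "torch": "1.11.0",
    "torchvision": "0.12.0",
    "mmcv-full": "1.6.2",
    "timm": "0.4.12",
}


def base_version(version):
    """Drop a local build tag, e.g. '1.11.0+cu113' -> '1.11.0'."""
    return version.split("+", 1)[0]


def find_mismatches(expected, get_version):
    """Return {name: (wanted, found)} for packages that are missing or differ."""
    mismatches = {}
    for name, wanted in expected.items():
        found = get_version(name)
        if found is None or base_version(found) != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    problems = find_mismatches(EXPECTED, installed_version)
    for name, (wanted, found) in problems.items():
        print(f"{name}: expected {wanted}, found {found}")
    if not problems:
        print("All pinned packages match.")
```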

CarlosGomes98 commented 11 months ago

Hi,

Thank you so much for your work on this! It's always hard to support Windows because of sneaky issues like these... I would really encourage you to submit a PR with these steps added to the README as a special case for Windows users. Like this, this would be logged as a contribution from you to this open source community :)

(In reply to agraham9966's comment of Oct 26, 2023, above.)

agraham9966 commented 11 months ago

Just a note on this: I did submit a PR. You may or may not need to downgrade yapf (described in an earlier post above):

pip uninstall yapf 
pip install yapf==0.40.1  

I had actually done this earlier on and wasn't able to verify whether I needed to or not. It's excluded from my pull request for now, but if someone else tries my setup and finds they need to do this, please let me know.

Otherwise, yes, I was able to get everything working - including training - on windows 10.

CarlosGomes98 commented 9 months ago

Pushed changes to address this. Linked to #42.

LucFrachon commented 8 months ago

@agraham9966 THANK YOU!!! I was so close to giving up, but your recommendations are the only ones that made it work for me -- and I'm on Ubuntu, not Windows. I would just add rasterio back to the package list in setup.py, and yes, uninstalling and downgrading yapf was required for me.