VIDA-NYU / tile2net

Automated mapping of pedestrian networks from aerial imagery tiles
BSD 3-Clause "New" or "Revised" License
146 stars 22 forks

Segmentation fault while inferencing #59

Closed neilknowscomputers closed 3 months ago

neilknowscomputers commented 3 months ago

While running example.sh I received a segmentation fault error.

This was the output:

Please enter the output directory:
poc
Tile generation will now begin.
INFO       Using Massachusetts as the source at location=(42.3536483721, -71.0716891532, 42.3555518995, -71.0643742337)
INFO       Using base_tilesize=256 from source
INFO       Stitching 12 tiles...
INFO       96 tiles missing out of 96 total.
           Downloading 96 files...                : 100%|██████████| 96/96 [00:00<00:00, 51955.25it/s]
           Downloading 96 tiles...                : 100%|██████████| 96/96 [00:00<00:00, 666.81it/s]
INFO       All 96 tiles are on disk.
           Stitching 6 tiles...                   : 100%|██████████| 6/6 [00:00<00:00, 12.69it/s]
INFO       Dumping to poc/example2/tiles/example2_256_info.json
INFO       Inferencing. Segmentation results will not be saved.
INFO       Using a single GPU.
INFO       Using Per Image based weighted loss
INFO       Using Cross Entropy Loss
INFO       Loading weights from: checkpoint=/home/ubuntu/tile2net/src/tile2net/raster/resources/assets/weights/satellite_2021.pth
INFO       init weights from normal distribution
INFO       loading pretrained model /home/ubuntu/tile2net/src/tile2net/raster/resources/assets/weights/hrnetv2_w48_imagenet_pretrained.pth
INFO       Trunk: hrnetv2
INFO       Model params = 72.1M
INFO       Using base_tilesize=256 from source
./examples/example.sh: line 10: 43410 Done                    python -m tile2net generate -l "$location" -o "$output_dir" -n $1
     43411 Segmentation fault      (core dumped) | python -m tile2net inference

Is this due to a lack of memory? Stack size? I'm running on a g3.4xlarge which has 122 GB of CPU memory and 8 GB of GPU memory. Do I need more?

I tried setting ulimit -s unlimited and got the same result. Any ideas? 🤔

Thanks for the help ❤️
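One generic way to narrow down a crash like this is Python's built-in faulthandler, which dumps the Python traceback to stderr when the interpreter receives SIGSEGV. The sketch below is not part of tile2net; `faulthandler_command` and `describe_exit` are hypothetical helper names, and the only assumption about tile2net is the `python -m tile2net inference` invocation shown in the log above.

```python
import signal
import sys


def faulthandler_command(module, *args):
    """Build an argv list that runs `python -m <module>` with faulthandler enabled."""
    # -X faulthandler is a standard CPython option; it prints the Python
    # traceback to stderr on fatal signals such as SIGSEGV.
    return [sys.executable, "-X", "faulthandler", "-m", module, *args]


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable description."""
    if returncode >= 0:
        return f"exited with status {returncode}"
    # On POSIX, subprocess reports death by signal N as return code -N.
    try:
        name = signal.Signals(-returncode).name
    except ValueError:
        name = f"signal {-returncode}"
    return f"killed by {name}"


if __name__ == "__main__":
    # Hypothetical usage: print the command that would replace the second
    # stage of the pipeline in example.sh.
    print(" ".join(faulthandler_command("tile2net", "inference")))
```

In the shell pipeline itself, the equivalent change would be to run the second stage as `python -X faulthandler -m tile2net inference`. The traceback only shows which Python call was active at the time of the crash; it does not by itself confirm a memory problem.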

neilknowscomputers commented 3 months ago

So I was able to get this to run after upping the machine resources to

It would be great to know what was needed (or if there was a way to configure how much memory is used)

Mary-h86 commented 3 months ago

@neilknowscomputers That’s great! Our inference process uses only one GPU, and a 12 GB GPU should be enough to run the example. In my experience, an NVIDIA 3080 Ti handles most of the cases I have dealt with. You can also run the example on a Google Colab T4 GPU (there is a Colab example under doc in our repo that you can take a look at). The exact memory requirements depend on your task (tile size and area coverage). Glad to know that it is working on your end. Let me know if you have any further questions about this; otherwise we can close this issue.
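As a rough pre-flight check, the GPU memory on a machine can be compared against the 12 GB figure mentioned above before starting inference. This is a minimal sketch that is not part of tile2net: it assumes `nvidia-smi` is on the PATH, the helper names are hypothetical, and the 12 GB threshold is only the rule of thumb from this thread, not a documented requirement.

```python
import shutil
import subprocess

# Assumption: 12 GB, taken from the rule of thumb in this thread.
REQUIRED_MIB = 12 * 1024


def parse_memory_totals(smi_output):
    """Parse one MiB value per line, skipping blank or non-numeric lines."""
    totals = []
    for line in smi_output.splitlines():
        value = line.strip()
        if value.isdigit():
            totals.append(int(value))
    return totals


def query_gpu_memory_mib():
    """Return total memory in MiB for each visible GPU, or [] if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []
    return parse_memory_totals(result.stdout)


def gpus_meeting_requirement(totals_mib, required_mib=REQUIRED_MIB):
    """Return the indices of GPUs whose total memory reaches the threshold."""
    return [i for i, total in enumerate(totals_mib) if total >= required_mib]


if __name__ == "__main__":
    totals = query_gpu_memory_mib()
    print("GPU memory (MiB):", totals)
    print("GPUs at or above threshold:", gpus_meeting_requirement(totals))
```

Total memory is only an upper bound: as noted above, the actual requirement depends on tile size and area coverage, so a GPU that passes this check can still run out of memory on a larger task.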

Mary-h86 commented 3 months ago

I'm closing this issue. Feel free to re-open it if you have any other questions.