Project-AgML / AgML

AgML is a centralized framework for agricultural machine learning. AgML provides access to public agricultural datasets for common agricultural deep learning tasks, with standard benchmarks and pretrained models, as well as the ability to generate synthetic data and annotations.
Apache License 2.0

OSError: Encountered an error when generating synthetic data. Process returned code -6. #39

Closed ctyeong closed 1 year ago

ctyeong commented 1 year ago

I hit the error below when running:

generator = agml.synthetic.HeliosDataGenerator(opt)
generator.generate(name = 'AV-GOblet6', num_images = 1)

The message mentions something about graphics. I was connecting from my local machine to a remote server via Jupyter Notebook. Could this setup matter? Thanks in advance for your help!

Loading XML file: /home/username/anaconda3/envs/agml/lib/python3.8/site-packages/agml/_helios/Helios/projects/SyntheticImageAnnotation/xml/style_AV-GOblet6.xml...done.
Reading XML file: /home/username/anaconda3/envs/agml/lib/python3.8/site-packages/agml/_helios/Helios/projects/SyntheticImageAnnotation/xml/style_AV-GOblet6.xml...Building canopy of goblet trellis grapevine...done.
Canopy consists of 1304 leaves and 942970 total primitives.
Ground geometry...done.
Ground consists of 1 total primitives.
done.
/home/username/.agml/synthetic/AV-GOblet6/image0Rendering RGB image containing 942.971K primitives...Initializing graphics...Failed to initialize graphics.
Common causes for this error:
-- OSX
  - Is XQuartz installed (xquartz.org) and configured as the default X11 window handler?  When running the visualizer, XQuartz should automatically open and appear in the dock, indicating it is working.
-- Linux
  - Are you running this program remotely via SSH? Remote X11 graphics along with OpenGL are not natively supported.  Installing and using VirtualGL is a good solution for this (virtualgl.org).
terminate called after throwing an instance of 'int'
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
Cell In[13], line 5
      2 generator = agml.synthetic.HeliosDataGenerator(opt)
      4 # Generate the data.
----> 5 generator.generate(name = 'AV-GOblet6', num_images = 1)

File ~/anaconda3/envs/agml/lib/python3.8/site-packages/agml/synthetic/generator.py:410, in HeliosDataGenerator.generate(self, name, num_images, output_dir, convert_data, clear_existing_files)
    405         raise OSError(f"Encountered an error when generating synthetic "
    406                       f"data. Process returned code {process.returncode}, "
    407                       f"suggesting that the program ran out of memory. Try "
    408                       f"passing a smaller environment for generation.")
    409     else:
--> 410         raise OSError(f"Encountered an error when generating synthetic "
    411                       f"data. Process returned code {process.returncode}.")
    413 # Convert the dataset format.
    414 if convert_data:

OSError: Encountered an error when generating synthetic data. Process returned code -6.
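
Note on the return code: on POSIX systems, Python's subprocess module reports a negative return code -N when the child process was terminated by signal N. Signal 6 on Linux is SIGABRT, which matches the "terminate called after throwing an instance of 'int'" line above, since an uncaught C++ exception ends in abort(). The helper below is a minimal sketch for decoding such codes; `describe_returncode` is a hypothetical name and is not part of AgML.

```python
import signal


def describe_returncode(returncode: int) -> str:
    """Describe a subprocess return code in words (POSIX convention).

    Hypothetical helper for illustration; not part of AgML or Helios.
    """
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # subprocess reports -N when the child was killed by signal N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            # Unknown signal number on this platform.
            name = f"signal {-returncode}"
        return f"terminated by {name}"
    return f"exited with error status {returncode}"


# The code reported in this issue.
print(describe_returncode(-6))
```

A signal-based exit like this points at a crash inside the Helios executable rather than at the Python wrapper.
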
masonearles commented 1 year ago

@dariojavo Is this an issue with running in a notebook? Or is the issue with running via ssh?

@ctyeong Do you encounter the same issue locally?

ctyeong commented 1 year ago

@masonearles It works fine when running locally ...

masonearles commented 1 year ago

Currently there are some hoops to jump through to run remotely (headless) due to the need for a graphics driver. I believe you can use a VNC or Remote Desktop to do it. I believe @bnbailey-psl is working on a headless option for Helios. @amogh7joshi could also probably advise on running with VNC.

amogh7joshi commented 1 year ago

I think you'll need to work out some settings using TurboVNC (installing it on both the remote machine and the local machine). I'm not entirely sure how this is done, though; it was already set up when I started using it.

You might get lucky if a simple ssh -Y works, though.

ctyeong commented 1 year ago

Actually, on my Mac, XQuartz does open automatically and appear in the dock when the generator runs, as the error message describes, but the error still comes up.

@amogh7joshi ssh -Y did not work for me.

ctyeong commented 1 year ago

Interesting. I've now found that the same error occurs even when running it locally on the Ubuntu server. It works locally if the notebook was initially opened from the same machine.

bnbailey-psl commented 1 year ago

If I understand correctly, the issue is related to running OpenGL remotely over SSH. This does not work natively, and you need an additional utility. To render graphics remotely using X11 forwarding over SSH, you can use VirtualGL. The downside to this is that it will be slow since it will render graphics over SSH. The better option is to use something like TurboVNC, which does the actual rendering on the remote machine and rendering speed should not be tied to network speed.
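
As a rough pre-flight check for the situation described above, the sketch below inspects the environment of the Python process before generation is started. It relies only on the standard `DISPLAY` variable (X11) and the `SSH_CONNECTION` / `SSH_CLIENT` variables set by OpenSSH. The function name `display_warnings` is hypothetical, and the check is a heuristic under those assumptions: a Jupyter kernel may not inherit these variables, depending on how the server was launched.

```python
import os
from typing import List, Mapping


def display_warnings(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return warnings about the graphics environment (heuristic only).

    Hypothetical helper for illustration; not part of AgML or Helios.
    """
    warnings = []
    if not env.get("DISPLAY"):
        # Without DISPLAY there is no X11 display to open a window on.
        warnings.append("DISPLAY is not set: the session looks headless.")
    if env.get("SSH_CONNECTION") or env.get("SSH_CLIENT"):
        # OpenSSH sets these variables for remote login sessions.
        warnings.append(
            "Running inside an SSH session: OpenGL over forwarded X11 "
            "may fail; consider VirtualGL or TurboVNC."
        )
    return warnings


for message in display_warnings():
    print(message)
```

An empty result does not guarantee that graphics initialization will succeed; it only rules out the two most obvious causes.
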

ctyeong commented 1 year ago

I've tried using VirtualGL and TurboVNC, but I got the error below. I've found that RGB images have all been generated from Helios, but I guess while the segmentation process is being finalized, that error occurs.

[Image: Screenshot 2023-02-11 at 1 51 01 AM]

dariojavo commented 1 year ago

Hi Taeyeong,

Do you have this issue when running locally too, or is it only with VirtualGL and TurboVNC?

ctyeong commented 1 year ago

Hi @dariojavo, I have no issue when running locally; it only occurs with VirtualGL and TurboVNC.

ctyeong commented 1 year ago

@dariojavo This error has somehow started to occur on my Ubuntu server even when running locally :( Could this be related to the Nvidia GPU or its driver?

This time, it says "buffer overflow detected", but the machine is a Linux server (Ubuntu 18) with plenty of resources.

writing JPEG image: /home/username/.agml/synthetic/tomato_sample31/image0/view00000/RGB_rendering.jpeg
writing JPEG image: /home/username/.agml/synthetic/tomato_sample31/image0/view00001/RGB_rendering.jpeg
writing JPEG image: /home/username/.agml/synthetic/tomato_sample31/image0/view00002/RGB_rendering.jpeg
done.
Generating labeled image containing 1 label groups...
Initializing graphics...done.
Performing semantic segmentation for view 0... and element: 
*** buffer overflow detected ***: /home/username/anaconda3/envs/agml2/lib/python3.10/site-packages/agml/_helios/Helios/projects/SyntheticImageAnnotation/build/SyntheticImageAnnotation terminated

The same "buffer overflow" issue also occurs under WSL on my Windows machine.

dariojavo commented 1 year ago

Hi Taeyeong,

This error occurs when the path is too long; it will need to be fixed in Helios. For now, you can work around it by reducing the path length.
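
To compare path lengths before a run, the sketch below rebuilds the image path seen in the log above (`<output root>/<name>/image0/view00000/RGB_rendering.jpeg`). The thread does not say which path overflows or what the limit is, so this only measures length and makes no claim about a safe value. `rendering_path` is a hypothetical helper, and treating the first component as the output root is an assumption based on the log.

```python
import posixpath


def rendering_path(output_root: str, name: str,
                   image_index: int = 0, view_index: int = 0) -> str:
    """Rebuild the RGB rendering path in the layout shown in the log.

    Hypothetical helper for illustration; the layout is inferred from
    the log output in this thread, not from the AgML source.
    """
    return posixpath.join(
        output_root,
        name,
        f"image{image_index}",
        f"view{view_index:05d}",
        "RGB_rendering.jpeg",
    )


# Path from the failing run versus a shorter alternative (example values).
long_path = rendering_path("/home/username/.agml/synthetic", "tomato_sample31")
short_path = rendering_path("/tmp/agml", "t31")

print(len(long_path), long_path)
print(len(short_path), short_path)
```

The traceback above shows that `generate` accepts an `output_dir` argument, so a shorter output directory and a shorter `name` are two plausible ways to reduce the length, assuming `output_dir` controls the root used here.
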

ctyeong commented 1 year ago

@dariojavo That is a very interesting solution, and I'm surprised that works without the "buffer overflow" issue! Thank you!