VCityTeam / py3dtilers

Tilers accepting various input formats (OBJ, 3DCity databases, GeoJson, IFC) and producing 3DTiles tilesets.
Other
175 stars 48 forks source link

CityTiler process gets killed when converting CityGML with too many/heavy textures #35

Closed jailln closed 2 years ago

jailln commented 2 years ago

When running the CityTiler with too many / too heavy textures, the process gets killed by my OS (Ubuntu 20.04). According to the kernel logs, it seems to be due to memory usage:

[267574.260974] Out of memory: Killed process 1606130 (citygml-tiler) total-vm:7268524kB, anon-rss:6205768kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:12756kB oom_score_adj:0

It happens with the 1st borough of Lyon 2018 vintage for instance.

Do you have the same issue ?

May be some memory is not correctly released or all textures are loaded at once ?

LorenzoMarnat commented 2 years ago

I never encountered this issue, but I think the memory gets overloaded by the textures. In order to create texture atlas for the tiles, the Tiler opens all the .png at once and close them only at the end of the process.

I'll try to find a way to open the textures one by one, and not all at once.

jailln commented 2 years ago

Great, thanks! Does the tiler opens all the textures of the tileset at once? Or all the textures of a tile at once?

LorenzoMarnat commented 2 years ago

I think it opens the texture of the whole tileset. Maybe @clementcolin knows much about it.

Sadly, I don't see any quick solution to fix your problem on the master branch. I've planned to work on the CityTiler textures really soon (this week or the next one), so I'll see if I can fix it.

But if you can split your borough into smaller parts (each part in a different CityGML file/database) a temporary solution could be to run the CityTiler on each part separately (to create a tileset per part). Then, with the Tileset Reader, merge all the tilesets into one.

This Tileset Reader is still in developpement, to use it switch on the branch tileset_reader and re-run the pip install -e .. To merge textured tilesets together, run something like:

 tileset-reader --path tileset1_path --merge tileset2_path tileset3_path --with_texture

There is no limit for the number of tileset paths after the --merge flag. The paths after the flags must be the paths to the root of each tileset. All the tilesets must be of the following form:

root
      tileset.json
      tiles
            .b3dm files
            .png files (textures)

(this is the hierarchy created by our Tilers)

jailln commented 2 years ago

I think it opens the texture of the whole tileset. Maybe @clementcolin knows much about it.

Maybe it should only opens the textures of the tiles since I guess that there is one atlas per tile ?

Sadly, I don't see any quick solution to fix your problem on the master branch. I've planned to work on the CityTiler textures really soon (this week or the next one), so I'll see if I can fix it.

Cool, thanks! Let me know :)

But if you can split your borough into smaller parts (each part in a different CityGML file/database) a temporary solution could be to run the CityTiler on each part separately (to create a tileset per part). Then, with the Tileset Reader, merge all the tilesets into one.

This Tileset Reader is still in developpement, to use it switch on the branch _tilesetreader and re-run the pip install -e .. To merge textured tilesets together, run something like:

 tileset-reader --path tileset1_path --merge tileset2_path tileset3_path --with_texture

There is no limit for the number of tileset paths after the --merge flag. The paths after the flags must be the paths to the root of each tileset. All the tilesets must be of the following form:

root
      tileset.json
      tiles
            .b3dm files
            .png files (textures)

(this is the hierarchy created by our Tilers)

Thanks for the workaround :+1:

clementcolin commented 2 years ago

Indeed there is one atlas per tile. We only open the textures of one tile at a time to create its atlas, as you can see here

However, we do not manage properly the memory, I think that each loaded texture of a tile stays in memory while creating the atlas, since we need to keep them around to sort them by size for our atlas algorithm. There should be a better solution to properly handle the memory.

Another workaround would be to reduce the number of buildings per tile here. Maybe we should create a flag to handle this parameter.

Also, keep in mind that the textures of buildings coming from the Data Grand Lyon are way heavier than classic data (since they are produced automatically from photogrammetry, without correction), and we are still limited by their memory weight (for all Lyon, ~20Gb). We need to look at texture compression for big data like that

jailln commented 2 years ago

Thanks for the details :)

Actually, from my tests, the process getting killed seems to depend on the total texture size of the input dataset and not on the size of the textures for a tile. This indicates that there is probably a memory leak related to textures management, but more tests should be done to validate this assertion. I have tried compressing the textures of Lyon beforehand (with jpegoptim) and I have also tried with other datasets than Lyon with lighter textures (not available open source unfortunately) and the same problem occurs when the total size of input texture is too important.

I will also try your workaround, thanks!

LorenzoMarnat commented 2 years ago

Hi @jailln , this PR should fix your problem. In fact, the texture images of the whole tileset were loaded in memory at once (memory usage was going up to 9Go for Lyon 1st borough).
Now, we load the images tile per tile, and free the meory after each tile. The memory usage should stay around 0,5 Go.

Hope this helps.

jailln commented 2 years ago

Thanks @LorenzoMarnat ! Now I get the following error when running the tiler with textures (--with_texture):

Traceback (most recent call last):
  File "/home/vincent/Documents/dev/py3dtilers/venv/bin/citygml-tiler", line 33, in <module>
    sys.exit(load_entry_point('py3dtilers', 'console_scripts', 'citygml-tiler')())
  File "/home/vincent/Documents/dev/py3dtilers/py3dtilers/CityTiler/CityTiler.py", line 181, in main
    tileset = city_tiler.from_3dcitydb(cursor, objects_type, split_surfaces)
  File "/home/vincent/Documents/dev/py3dtilers/py3dtilers/CityTiler/CityTiler.py", line 135, in from_3dcitydb
    objects_to_tile = self.get_surfaces_with_texture(cursor, cityobjects, objects_type)
  File "/home/vincent/Documents/dev/py3dtilers/py3dtilers/CityTiler/CityTiler.py", line 105, in get_surfaces_with_texture
    surface.geom = TriangleSoup.from_wkb_multipolygon(geom_as_string, associated_data)
  File "/home/vincent/Documents/dev/py3dtilers/venv/lib/python3.8/site-packages/py3dtiles/wkb_utils.py", line 50, in from_wkb_multipolygon
    triangles = triangulate(polygon, additionalPolygons)
  File "/home/vincent/Documents/dev/py3dtilers/venv/lib/python3.8/site-packages/py3dtiles/wkb_utils.py", line 316, in triangulate
    trianglesIdx = earcut(polygon2D, holes, 2)
  File "/home/vincent/Documents/dev/py3dtilers/venv/lib/python3.8/site-packages/py3dtiles/earcut.py", line 26, in earcut
    outerNode = eliminateHoles(data, holeIndices, outerNode, dim)
  File "/home/vincent/Documents/dev/py3dtilers/venv/lib/python3.8/site-packages/py3dtiles/earcut.py", line 282, in eliminateHoles
    eliminateHole(queue[i], outerNode)
  File "/home/vincent/Documents/dev/py3dtilers/venv/lib/python3.8/site-packages/py3dtiles/earcut.py", line 292, in eliminateHole
    outerNode = findHoleBridge(hole, outerNode)
  File "/home/vincent/Documents/dev/py3dtilers/venv/lib/python3.8/site-packages/py3dtiles/earcut.py", line 310, in findHoleBridge
    if hy <= p.y and hy >= p.next.y and p.next.y - p.y != 0:
AttributeError: 'NoneType' object has no attribute 'y'

I reproduced it with two datasets, one of them being from the Grand Lyon: all borough of 2018 vintage in CityGML, with textures. I don't get this error when running the tiler without the --with_texture argument.

LorenzoMarnat commented 2 years ago

Best I can do for now is to catch the AttributeError and skip the geometry. I think the error comes from the data, but I'll do some tests with Lyon's borough later.

When running the Tiler with texture, the Tiler uses get_surfaces_with_texture to create the geometries. This method doesn't handle AttributeError.
This error doesn't occur without the --with_texture flag because the error is caught in get_surfaces_merged

jailln commented 2 years ago

Oh ok thanks. If it can help, I didn't have this error before pulling the last changes yesterday (I don't remember which version I was running before pulling though but I think it was from one or two month max and my guess is that it was the version just before the last two PRs #37 and #38 so it may be related to these commits). Let me know if you need more information to reproduce this bug.

jailln commented 2 years ago

It works with your last modification (catching the AttributeError), so I'll close this issue.

jailln commented 2 years ago

Reopening this issue since it still happens. For instance with all the boroughs of Grand Lyon in 2018 with textures. Looking at the RAM on my computer when running the translation, it increases steadily from 4Go to 9.5Go, with bearings aroung 6.5, 8.5 and 9.5 Go and then it increases a lot suddenly and the process gets killed. Did you observed such a behaviour ? I will try to add some logs to see which part of the code is the culprit.

Note that it does not happen without textures.

LorenzoMarnat commented 2 years ago

When running the cityTiler on the 9 boroughs of Lyon 2018 with textures (on buildings), the memory sticks around 3 and 4 Go. It uses 3 Go for the geometries/UVs/instances and between 0.2 and 1 Go (depending on the tile) while creating the textures atlas.

However, I had an error with Pillow (the image library) caused by an image exeeding the limit of pixels:

Traceback (most recent call last):
  File "/mnt/c/Users/CColin/Documents/GitHub/py3dtilers/venv/bin/citygml-tiler", line 11, in <module>
    load_entry_point('py3dtilers', 'console_scripts', 'citygml-tiler')()
  File "/mnt/c/Users/CColin/Documents/GitHub/py3dtilers/py3dtilers/CityTiler/CityTiler.py", line 185, in main
    tileset = city_tiler.from_3dcitydb(cursor, objects_type, split_surfaces)
  File "/mnt/c/Users/CColin/Documents/GitHub/py3dtilers/py3dtilers/CityTiler/CityTiler.py", line 150, in from_3dcitydb
    return self.create_tileset_from_geometries(objects_to_tile, extension_name=extension_name)
  File "/mnt/c/Users/CColin/Documents/GitHub/py3dtilers/py3dtilers/Common/tiler.py", line 101, in create_tileset_from_geometries
    return create_tileset(objects_to_tile, self.args.lod1, create_loa, self.args.loa, extension_name, self.args.with_texture)
  File "/mnt/c/Users/CColin/Documents/GitHub/py3dtilers/py3dtilers/Common/tileset_creation.py", line 17, in create_tileset
    create_tile(root_node, tileset, centroid, centroid, 0, extension_name)
  File "/mnt/c/Users/CColin/Documents/GitHub/py3dtilers/py3dtilers/Common/tileset_creation.py", line 30, in create_tile
    content_b3dm = create_tile_content(objects, extension_name, node.has_texture())
  File "/mnt/c/Users/CColin/Documents/GitHub/py3dtilers/py3dtilers/Common/tileset_creation.py", line 72, in create_tile_content
    tile_atlas = Atlas(objects)
  File "/mnt/c/Users/CColin/Documents/GitHub/py3dtilers/py3dtilers/Texture/atlas.py", line 16, in __init__
    textures = objects_to_tile.get_textures()
  File "/mnt/c/Users/CColin/Documents/GitHub/py3dtilers/py3dtilers/CityTiler/citym_cityobject.py", line 64, in get_textures
    texture_dict[object_to_tile.get_id()] = uri_dict[uri].get_cropped_texture_image(object_to_tile.geom.triangles[1])
  File "/mnt/c/Users/CColin/Documents/GitHub/py3dtilers/py3dtilers/Texture/texture.py", line 20, in get_cropped_texture_image
    image = self.cropImage(self.image, uvs)
  File "/mnt/c/Users/CColin/Documents/GitHub/py3dtilers/py3dtilers/Texture/texture.py", line 42, in cropImage
    cropped_image = image.crop((minX * texture_size[0], minY * texture_size[1], maxX * texture_size[0], maxY * texture_size[1]))
  File "/mnt/c/Users/CColin/Documents/GitHub/py3dtilers/venv/lib/python3.8/site-packages/PIL/Image.py", line 1147, in crop
    return self._new(self._crop(self.im, box))
  File "/mnt/c/Users/CColin/Documents/GitHub/py3dtilers/venv/lib/python3.8/site-packages/PIL/Image.py", line 1165, in _crop
    _decompression_bomb_check(absolute_values)
  File "/mnt/c/Users/CColin/Documents/GitHub/py3dtilers/venv/lib/python3.8/site-packages/PIL/Image.py", line 2890, in _decompression_bomb_check
    raise DecompressionBombError(
PIL.Image.DecompressionBombError: Image size (618579682336 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
jailln commented 2 years ago

Thanks for trying! That's weird, did you do the same process than me? :

LorenzoMarnat commented 2 years ago

I imported all ...bati.gml from Lyon 1 to Lyon 9 in a localy hosted 3DCityDB, then ran the same comand (with and without CRS change).

LorenzoMarnat commented 2 years ago

I will try to delete as much instances as I can during the tileset creation to save memory. For example, the triangles/UVs can be destroyed once they are written in a tile as binary. This should reduce the memory usage a bit.

Some enhancements would be to:

The best way to handle really large amount of data in the current state is to create one tileset per borough. Then, use the tileset-merger command (not merged yet on master) to merge all the boroughs together. I didn't test the TilesetMerger on large tilesets yet, but its memory usage should stay low since it doesn't load triangles/images in memory.

jailln commented 2 years ago

After looking into it, here are my findings (with all the boroughs of the Lyon 2018 vintage):

Now, some questions :slightly_smiling_face: :

LorenzoMarnat commented 2 years ago

I actually don't know whats the best way to handle the texture atlas. Maybe limiting the size of the atlas would be a good start. We could distribute the textures of a tile in many atlas (of an arbitrary maximum size) to limit the size of each atlas image. A tile would then use 1, 2 or more atlas. To do this, we only need to match each ObjectToTile with the atlas containing its texture (we do something similar to handle the materials when a tile has different materials). Note that I have no idea how having many atlases for a tile would impact the loading performances (if it does impact the performances).

jailln commented 2 years ago

I actually don't know whats the best way to handle the texture atlas. Maybe limiting the size of the atlas would be a good start. We could distribute the textures of a tile in many atlas (of an arbitrary maximum size) to limit the size of each atlas image. A tile would then use 1, 2 or more atlas. To do this, we only need to match each ObjectToTile with the atlas containing its texture (we do something similar to handle the materials when a tile has different materials).

Yes, that's a good idea! From what I read, 4,096 x 4,096 pixels seems like a good maximum size, but it depends on devices where you will render the texture so we should maybe make it configurable.

Note that I have no idea how having many atlases for a tile would impact the loading performances (if it does impact the performances).

Generally one texture implies one draw call and generally you want to reduce the number of draw calls to your graphic card to improve performances. However, the maximum texture size supported by a device depends on the graphic card, so it won't work if you try to send to big textures.

clementcolin commented 2 years ago

First of all, I do not think that we can properly handle texture without introducing a LOD system that takes into account the atlas size of a tile and limit its size.

For example, if we set a limit of 4.096 x 4.096 pixels as Jailln recommended, we group Features in the same tile until the texture size reach the limit. Then, when grouping the tiles together (usually 4 by 4 to respect quadtree specification) to create LOD, we can respect this limit by grouping also the atlases and compressing them (by a factor 4 if we group tiles 4 at a time). We would then need to change the UV for each LOD.

In the tiler, if we do it correctly (i.e changing the workflow), we then should be able to only have to handle 5 atlas at a time (4 child and its parent).

As the atlases will be less and less precise, we need to find an objective criterion to stop the compression of atlases and drop the texture (maybe replacing them with style or simple color). Maybe by the number of features per Tile ? or the geographic size of the tile?

Another thing that may be interesting, more linked to the visualization part, is the usage of mipmap.

jailln commented 2 years ago

I totally agree that we should look at such mechanisms to improve performances. One possibility would be to implement a new LODs mechanism where each level store simplified versions of their n+1 level (simplified geometry and texture) (see #53). Regarding mipmaps, we could store them with the ktx2 texture format (see #46).

LorenzoMarnat commented 2 years ago

This PR reduce the memory usage of the CityTiler by creating the tileset tile by tile. Compression, tile/atlas size limit and new LOD mechanism will be done in the future (issues are open).