dcoeurjo / pc2vol


pc2vol produces no file output when run on a larger point set #1

Open EricBoix opened 9 months ago

EricBoix commented 9 months ago

Build pc2vol within a docker container

docker --version       # 24.0.7, build afdd53b
docker build -t dgtal/pc2vol https://github.com/VCityTeam/UD-Reproducibility.git#:Computations/3DTiles/ElaphesCave/DockerContexts/pc2volContext

Test it on the bunny

cd /tmp
mkdir junk && cd junk
wget https://raw.githubusercontent.com/dcoeurjo/pc2vol/main/bunny.pts
docker run --rm -v `pwd`:/data -it dgtal/pc2vol pc2vol -i /data/bunny.pts -o /data/bunny.vol

that outputs

Point cloud bbox: (-0.5, -0.5, -0.5) (30.5, 23.5, 30.5)
Digital domain = 34300 [HyperRectDomain] = [(-2, -2, -2)]x[(32, 25, 32)]
New Block [WindingNumber BVH]
   Min/max point area : 0 -- 12.2927
EndBlock [WindingNumber BVH] (40474.7 ms)
Number of queries = 34300
Number of voxels = 6897
New Block [Exporting]
EndBlock [Exporting] (22.4085 ms)

and dumps a bunny.vol output file

wc bunny.vol    # 14      53    1297 bunny.vol

All is nice.

Now let's try something heavier by applying pc2vol to this cave point cloud

wget https://dataset-dl.liris.cnrs.fr/elaphes-cave/Exp-Cloud-ELAPHS-94M-sanitized-just-normals-with-CloudCompare.pts
head -n 1 Exp-Cloud-ELAPHS-94M-sanitized-just-normals-with-CloudCompare.pts
# returns 28.4301692390 0.0140637016 -4.5360868835 0.971552 -0.229684 0.057728

and this point set thus appears to have the X, Y, Z, Nx, Ny, Nz format that pc2vol expects.
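Since `head -n 1` only inspects the first point, the rest of the file can be checked the same way. A minimal sketch, assuming whitespace-separated columns and plain decimal or scientific notation; the function name `count_malformed_lines` is only illustrative and is not part of pc2vol:

```shell
# Print the number of lines that do not consist of exactly 6 numeric fields.
count_malformed_lines() {
  awk '
    NF != 6 { bad++; next }
    {
      for (i = 1; i <= 6; i++) {
        # Accept integers, decimals and scientific notation, optionally signed.
        if ($i !~ /^[-+]?([0-9]+\.?[0-9]*|\.[0-9]+)([eE][-+]?[0-9]+)?$/) { bad++; next }
      }
    }
    END { print bad + 0 }
  ' "$1"
}

# Usage (expect 0 for a well-formed file; this scans every line, so it is slow on 94M points):
#   count_malformed_lines Exp-Cloud-ELAPHS-94M-sanitized-just-normals-with-CloudCompare.pts
```

A non-zero count would point at header lines, missing normals or stray characters (for instance Windows line endings) somewhere after the first line.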

But when running

docker run --rm -v `pwd`:/data -it dgtal/pc2vol pc2vol -i /data/Exp-Cloud-ELAPHS-94M-sanitized-just-normals-with-CloudCompare.pts -o /data/Exp-Cloud-ELAPHS-94M-sanitized-just-normals-with-CloudCompare.vol

the output goes

Point cloud bbox: (-9.9391, -3.23727, -6.96334) (73.7245, 12.9577, 4.70943)
Digital domain = 26100 [HyperRectDomain] = [(-11, -5, -8)]x[(75, 14, 6)]

and the container exits without producing an output file. What am I doing wrong?

dcoeurjo commented 9 months ago

Nothing wrong. It seems to be related to a memory consumption issue in a CGAL tool we are using (discussed by email with Eric L and Martial), which may cause problems under Docker.

dcoeurjo commented 9 months ago

If you run the commands outside Docker (directly on your OS), you should be able to see the memory issue (which is unrelated to DGtal or to pc2vol itself).

elombardi2 commented 9 months ago

There are 2 different and unrelated memory issues:

  1. the CGAL lib used by the pc2vol tool has a memory issue, as stated by @dcoeurjo
  2. on Mac only, the amount of RAM usable by Docker is limited because Docker runs inside a virtual machine; this issue doesn't exist on Linux, nor on PAGODA
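To see how much memory the Docker engine actually has (the VM on a Mac, the host on Linux), `docker info` can be queried for its `MemTotal` field. A small sketch; the helper name `bytes_to_gib` is illustrative, and the query is skipped when no Docker engine is reachable:

```shell
# Convert a byte count to GiB with one decimal place.
bytes_to_gib() {
  awk -v b="$1" 'BEGIN { printf "%.1f\n", b / (1024 * 1024 * 1024) }'
}

# Ask the Docker engine how much memory it sees; do nothing if docker is unavailable.
if command -v docker >/dev/null 2>&1; then
  mem_bytes=$(docker info --format '{{.MemTotal}}' 2>/dev/null) || mem_bytes=""
  if [ -n "$mem_bytes" ]; then
    echo "Docker engine memory: $(bytes_to_gib "$mem_bytes") GiB"
  fi
fi
```

On a Mac the reported value is the VM allocation, which can be well below the physical RAM of the machine.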

See also this comment.

dcoeurjo commented 9 months ago

thx @elombardi2

FYI: unless I've missed something, I don't have access to the GRIM repo

mtola commented 9 months ago

EBO, if you increase the memory allocated to Docker on Mac to 64 GB (ok, I admit you need a racing machine!), it should normally work. ;-)

By the way, can you please add DCO to the GRIM project? Thanks. Update: done by ELO

EricBoix commented 9 months ago

@mtola Well, it happens that my usage of pc2vol has to be container-based (my operational context relies on the Argo Workflows platform for scale-up and reproducibility reasons). And my dev context happens to be an OSX desktop with a (reasonably) limited 32 GB configuration. So raising to 64 GB is alas not an option for me, since dev precedes ops... ;-) Besides, I'm not sure I understand why the particular OS could/should be relevant when one is trying to (partially) abstract oneself from the underlying OS by working with containers.

I understand that, at some usage level, RAM has to be a limit. But I have a hard time understanding why this limitation has to be platform-specific. And if that is the case, can't pc2vol trap memory allocation failures (even those occurring in some underlying CGAL library) and emit an informative error message?
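Short of such a message, the exit status of `docker run` already gives a hint about how the process ended: by shell convention a status above 128 means termination by signal (status minus 128), and 137 corresponds to SIGKILL, which is the signal the kernel OOM killer sends. A minimal sketch; the function name `explain_exit_status` is illustrative, and nothing in this thread confirms which status the failing runs returned:

```shell
# Translate an exit status into a short human-readable description.
explain_exit_status() {
  status=$1
  if [ "$status" -eq 0 ]; then
    echo "exited normally"
  elif [ "$status" -gt 128 ]; then
    # Convention: 128 + N means the process was terminated by signal N.
    echo "terminated by signal $((status - 128))"
  else
    echo "failed with exit code $status"
  fi
}

# Usage, right after the failing run:
#   docker run --rm -v "$(pwd)":/data dgtal/pc2vol pc2vol -i /data/bunny.pts -o /data/bunny.vol
#   explain_exit_status $?
```

A result of "terminated by signal 9" would be consistent with an out-of-memory kill, although SIGKILL can have other origins.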

To check for a possible side effect (on RAM usage) of working with an underlying VM on OSX, I tried running pc2vol on (almost) arbitrarily tiny data sets: refer here for a tiny context for working on tiny data sets. Alas, it seems that even with 5k-point inputs, pc2vol produces no output. So, however context-specific the problem might be, it seems to arise regardless of any memory footprint threshold, which is quite surprising...

Note: is pc2vol computing the correct bounding box?

When one tries to run pc2vol on the tunetgen_n_30.xyz input file with

docker run --rm -v `pwd`/data:/data -it dgtal/pc2vol /home/pc2vol/pc2vol/build/pc2vol -i /data/tunetgen_n_30.xyz -o /data/tunetgen_n_30.vol

then pc2vol reports

Point cloud bbox: (100, 100, -17.3821) (1.84294e+06, 5.17647e+06, 17.1562)

that seems incorrect since

grep "^1842" -v data/tunetgen_n_30.xyz

produces no output, meaning that every line (and thus every X coordinate) starts with 1842, which is far larger than the lower X bound of the bounding box, reported as 100.
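The reported bounding box can also be cross-checked independently of pc2vol by computing the per-axis minima and maxima straight from the file. A minimal sketch, assuming whitespace-separated columns with X, Y, Z first; the function name `bbox` is illustrative:

```shell
# Print the per-axis min and max of the first three columns of a point file.
bbox() {
  awk '
    NF >= 3 {
      for (i = 1; i <= 3; i++) {
        v = $i + 0
        if (n == 0 || v < lo[i]) lo[i] = v
        if (n == 0 || v > hi[i]) hi[i] = v
      }
      n++
    }
    END {
      if (n == 0) { print "no points"; exit 1 }
      # Fixed-point output avoids the rounding of awk default number formatting.
      printf "min %.4f %.4f %.4f\n", lo[1], lo[2], lo[3]
      printf "max %.4f %.4f %.4f\n", hi[1], hi[2], hi[3]
    }
  ' "$1"
}

# Usage:
#   bbox data/tunetgen_n_30.xyz
```

If the minima printed here are nowhere near 100, the discrepancy lies in how pc2vol reads or initializes the bounding box rather than in the data.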

Note: the --gridstep flag does not seem to help on the geographic coordinates

The input files use geographic coordinates, and the --gridstep flag does not seem to help with them. For example

docker run --rm -v `pwd`/data:/data -it dgtal/pc2vol /home/pc2vol/pc2vol/build/pc2vol -i /data/tunetgen_n_30.xyz -o /data/tunetgen_n_30.vol --gridstep 10000

yields

Point cloud bbox: (100, 100, -17.3821) (1.84294e+06, 5.17647e+06, 17.1562)
Digital domain = 489740 [HyperRectDomain] = [(-1, -1, -2)]x[(186, 519, 2)]
New Block [WindingNumber BVH]

and no output file.
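For reference, the number printed after "Digital domain =" in these logs matches the count of integer cells between the two printed corners, both ends included. The sketch below reproduces that arithmetic for the three logs quoted in this thread; the function name `domain_size` is illustrative:

```shell
# Number of integer cells in the box [(lx, ly, lz)] x [(ux, uy, uz)], corners included.
domain_size() {
  echo $(( ($4 - $1 + 1) * ($5 - $2 + 1) * ($6 - $3 + 1) ))
}

domain_size -2 -2 -2 32 25 32    # bunny: 34300
domain_size -11 -5 -8 75 14 6    # cave: 26100
domain_size -1 -1 -2 186 519 2   # tunetgen with --gridstep 10000: 489740
```

In the bunny run the same figure appears as "Number of queries = 34300", so the domain size appears to be a direct measure of the work requested, and a lower corner stretched toward 100 enlarges it accordingly.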