3drepo / 3drepobouncer

A C++ library providing 3D Repo Scene Graph definition, repository management logic and manipulation logic. It is is essentially the refactored 3DRepoCore and (parts of) 3DRepoGUI
GNU Affero General Public License v3.0
29 stars 13 forks source link

A small IFC file is stuck in an infinite loop generating geometry #633

Closed carmenfan closed 1 year ago

carmenfan commented 1 year ago

Description

A customer IFC file of ~6MB is timing out. Upon inspection, it seems to be stuck on an infinite loop whilst trying to generate geometry.

Link to the model: https://asitesol.sharepoint.com/:f:/s/3DRepoTeamFolder/EvYJY_cTSr1CjSS6grflBRcBtCXPMDaIg6hAuvnafjogfQ?e=LSPGu5

The log is stuck on this stage forever until times out: image

My suspicion is that it is trying to generate too high level LOD on a some tiny curves, resulting in the same vertice is being generated and it never hits the loop terminating condition (have seen something similar before in IFC Open shell years ago). So tweaking the LOD configurations may be the easiest fix as we are looking into this anyway with #632

sebjf commented 1 year ago

Recreated this issue in my local bouncer, and found where to set the angular resolution and linear tolerance in the import settings. These are the same as used by ODA, which is nice for #632 (they both seem to use a BRep underneath).

Unfortunately changing the LOD doesn't help the issue.

Using IfcConvert from the 0.7.0 build from IfcOpenShell website also locks up indefinitely, so the problem is in the latest version of the library.

Building IfcOpenShell from source and running IfcConvert under the debugger shows it breaks in RelocatePCurvesToNewUOrigin. (ShapeUpgrade_UnifySameDomain.cxx line ~360)

What happens in this function is that it tries to evaluate a sequence of curves on a face (edges in a contour). At the start of the loop it tries to remove the last evaluated edge from the list of edges (theVEmap), which should eventually leave this list empty. This list is never emptied however, causing the loop to run infinitely.

The RemoveEdgeFromMap method works by checking the edge against each entry using the IsSame member, which seems to be a superclass method.

The debugging symbols lose where they are here so its not obvious why this call fails. The method itself is in OCCT so not easy to modify.

carmenfan commented 1 year ago

@sebjf could it be that the problem lies in OCCT?

sebjf commented 1 year ago

Hi @carmenfan the problem is definitely in OCCT.

The fault occurs in Unify Same Domain Build() stage, which merges coincident faces/edges that lie on coincident surfaces/curves.

Specifically, its in RelocatePCurvesToNewUorigin() function.

This method iterates through all the Edges in a Shape (which sit on a curved surface). For each, it gets the Curve formed by the projection of the Edge (CurEdge) on the Surface (CurPCurve) and adds this to a Map, while also storing the starting Parameter, and the Point on the Curve at this Parameter.

In a second nested loop it iterates through all Edges that start at the same Vertex as CurEdge (anEdge) using a map of the edges that start at each vertex in the Shape theVEmap. It performs a similar operation for these (creating a new Curve, and adding it to a Map).

When this second loop exits, it picks up the next Edge in the Shape and repeats.

In the second nested loop, CurEdge is added to a map, theUsedEdges and removed from theVEmap. The outermost loop terminates when it can't find an Edge that does not exist in theUsedEdges.

The problem is the termination criteria of the second nested loop. This loop only terminates when there are no more edges in theVEmap for the starting vertex of CurEdge.

In Concrete.ifc however, this method encounters a 'CurEdge' that never triggers this criteria, causing it to get stuck in the loop indefinitely.

It is not entirely obvious why this criteria is never reached, but it seems to be an interaction between two parts of this method:

  1. The RemoveEdgeFromMap() call which should remove 'CurEdge' from 'theVEmap' and therefore make it smaller on each iteration, and..
  2. A check for whether the start Point of anEdge matches (is within a certain distance of) that of CurEdge.

This second check occurs before 'CurEdge' is updated, meaning if it fails 'CurEdge' goes straight back to be removed from 'theVEmap' - if this operation does not decrement 'theVEmap' (which it doesn't) the loop gets stuck.

I believe the intent behind how this is written is that it if anEdge is not valid for some reason, it doesn't try to create a new Curve for it. The problem is that if calling RemoveEdgeFromMap() doesn't result in theVEmap being reduced in size for whatever reason, this check will prevent CurEdge from advancing.

What is not clear to me at the moment is whether it should be possible to call RemoveEdgeFromMap() and not actually remove anything, because if not, a simple fix is to just report a failure to remove the edge, and terminate the inner loop here.

Another possible fix is to gate the curve updates differently, and make sure that 'CurEdge' is always updated by this loop.

carmenfan commented 1 year ago

@sebjf would updating OCCT fixes this? (building it takes forever though 😆). Might worth checking if they've identified the issue else where and have since fixed it

sebjf commented 1 year ago

As I've found 😆 (I had to rebuild with optimisation off to investigate properly!)

It is a known issue: https://tracker.dev.opencascade.org/view.php?id=32949

Reported last year but without working steps to reproduce. Someone posted another example a couple of months ago, but no activity since.

sebjf commented 1 year ago

A straightforward fix is to detect the exact condition that leads to the infinite loop (i.e. CurEdge is not removed from the theVEmap, and is not changed by the third nested loop).

This can be done by modifying RemoveEdgeFromMap to return how many elements have been removed, and storing the value of CurEdge before it enters the curve construction loop.

If it's the case that RemoveEdgeFromMap should never fail to remove an edge, this test could be made simpler, or better the function restructured to avoid the possibility in the first place. However, that requires a good knowledge of the intent of the method (its not obvious whether RemoveEdgeFromMap should be able to fail, or the third loop exit without modifying CurEdge).

As it is without this info, the explicit test is safer. It is also highly efficient, using properties that are practically already available, just storing them for a little longer, and sits in a couple of lines at the end of the method.

Concrete.ifc with the modifications:

image

Dev Notes

OpenShell 0.0.7 will make a copy of all dependencies locally, as source, which is nice. So modifications of OCCT can be attempted in-place. The debugger will find the source of the dependencies on its own but they won't automatically rebuild.

The OCCT project will have a Visual Studio solution already created for it, e.g. in

D:\3drepo\bouncer\IfcOpenShell-0.7.0\_deps\occt_git\_build-vs2019-Win32\OCCT.sln

This builds into

D:\3drepo\bouncer\IfcOpenShell-0.7.0\_deps\occt_git\_build-vs2019-Win32\win32\vc14\libi

The IfcOpenShell project takes the libraries from

D:\3drepo\bouncer\IfcOpenShell-0.7.0\_deps-vs2019-Win32-installed\opencascade-7.5.3\win32\lib

So, when modifying OCCT, the modified libraries must be copied from _deps\occt_git\_build-vs2019-Win32\win32\vc14\libi into _deps-vs2019-Win32-installed\opencascade-7.5.3\win32\lib manually.

sebjf commented 1 year ago

Bouncer Fix

Regarding bouncer compatibility,

  1. Builds of bouncer with OCCT-7.5.3, but the existing OpenShell-0.6.0 binaries, fail with an access violation.
  2. (OCCT-7.5.3 is the version targeted by the IfcOpenShell-0.7.0 release, which is only a couple of minor versions behind master.)
  3. Building bouncer with the IfcOpenShell-0.7.0 available in binary form from the website fails with API incompatibilities.
  4. Building IfcOpenShell-0.6.0 with our version of OCCT-7.5.3 fails with a linker error (nothing to do with our OCCT, its creating too many linker objects even with incremental linking off).
  5. Building bouncer with our version of OpenShell-0.6.0 works.

Using build-deps of OpenShell-0.6.0 allows us to see OpenShell-0.6.0 is using OCCT V7_3_0p3. This is built as a static library however, as is OpenShell-0.6.0 as a whole, by the included build scripts.

Using depends, we can confirm that the prebuilt bouncer binaries dynamically link the OCCT binaries.

Unfortunately, building V7_3_0p3 does not allow the existing OpenShell-0.6.0 libraries to run, because of missing entry points (that is, they must not have been built with 7.3.0 and there is no way to find out which version they were built with).

Building IfcOpenShell

The SHA is b8d36a63a29f37193f52027b421607f7c37434ce.

The cmake changes (e.g. made through cmake-gui) are summarised below:

(Note specifically that SCHEMA_VERSIONS will not be detected automatically and must be added using Add Entry)

Commandline options:
-DBUILD_IFCPYTHON:BOOL="0" -DBUILD_GEOMSERVER:BOOL="0" -DCMAKE_INSTALL_PREFIX:PATH="D:\3drepo\bouncer\IfcOpenShell-0.6.0\install" -DLIBXML2_LIBRARY:FILEPATH="LIBXML2_LIBRARY-NOTFOUND" -DBUILD_EXAMPLES:BOOL="0" -DLIBXML2_INCLUDE_DIR:PATH="LIBXML2_INCLUDE_DIR-NOTFOUND" -DSCHEMA_VERSIONS:STRING="2x3;4" -DBoost_INCLUDE_DIR:PATH="D:\3drepo\bouncer\boost_1_59_0" -DBUILD_IFCGEOM:BOOL="1" -DIFCXML_SUPPORT:BOOL="0" -DCOLLADA_SUPPORT:BOOL="0" -DOCC_INCLUDE_DIR:PATH="D:\3drepo\bouncer\OCCT\opencascade-7.3.0\install\inc" -DBUILD_SHARED_LIBS:BOOL="1" -DCMAKE_CONFIGURATION_TYPES:STRING="RelWithDebInfo" -DOCC_LIBRARY:FILEPATH="D:\3drepo\bouncer\OCCT\opencascade-7.3.0\install\win64\vc14\libi" -DLIBXML2_XMLLINT_EXECUTABLE:FILEPATH="LIBXML2_XMLLINT_EXECUTABLE-NOTFOUND" -DlibTKernel:FILEPATH="D:\3drepo\bouncer\OCCT\opencascade-7.3.0\install\win64\vc14\libi\TKernel" 

Cache file:
BUILD_IFCPYTHON:BOOL=0
BUILD_GEOMSERVER:BOOL=0
CMAKE_INSTALL_PREFIX:PATH=D:\3drepo\bouncer\IfcOpenShell-0.6.0\install
LIBXML2_LIBRARY:FILEPATH=LIBXML2_LIBRARY-NOTFOUND
BUILD_EXAMPLES:BOOL=0
LIBXML2_INCLUDE_DIR:PATH=LIBXML2_INCLUDE_DIR-NOTFOUND
SCHEMA_VERSIONS:STRING=2x3;4
Boost_INCLUDE_DIR:PATH=D:\3drepo\bouncer\boost_1_59_0
BUILD_IFCGEOM:BOOL=1
IFCXML_SUPPORT:BOOL=0
COLLADA_SUPPORT:BOOL=0
OCC_INCLUDE_DIR:PATH=D:\3drepo\bouncer\OCCT\opencascade-7.3.0\install\inc
BUILD_SHARED_LIBS:BOOL=1
CMAKE_CONFIGURATION_TYPES:STRING=RelWithDebInfo
OCC_LIBRARY:FILEPATH=D:\3drepo\bouncer\OCCT\opencascade-7.3.0\install\win64\vc14\libi
LIBXML2_XMLLINT_EXECUTABLE:FILEPATH=LIBXML2_XMLLINT_EXECUTABLE-NOTFOUND
libTKernel:FILEPATH=D:\3drepo\bouncer\OCCT\opencascade-7.3.0\install\win64\vc14\libi\TKernel

This allows OpenShell-0.6.0 to build, and bouncer to link with OpenShell-0.6.0 and OCCT-7.3.0. However, in this configuration, Concrete.ifc loads in fine!

Therefore, the bug must have been introduced between 7.3.0 and 7.5.0, and the existing OpenShell-0.6.0 binaries built using this version. This does mean however, that without knowing which version it was, we may introduce regression bugs using any given build of OCCT. The one that travis takes seems to be 7.5.0, based on the file naming.

Building a checkout of V7_5_0 (exactly) is binary compatible with the existing OpenShell-0.6.0 build, and the infinite loop manifests. Applying the fix above to this branch allows the existing bouncer/openshell to work with Concrete.ifc.

PR

Regarding the PR, the file containing the fix has undergone a number of changes since 7.5.3, so we will probably need to re-create the issue in master and make a PR based there, while duplicating the fix in our own fork of 7.3.0 until we upgrade at a later date.

Using a modified version of IfcOpenShell-0.7.0, we have extracted just the shape causing the issue, however this does not import into Draw.exe built from the same tag/branch as used by IfcOpenShell-0.7.0, because the importer doesn't know about file version 3, even though it wrote it.

Adding a snippet to the beginning of DRAWEXE allows us to recreate the issue inside OCCT-7.5.3 (separate to IfcOpenShell, and the customer's file) and demonstrate the fix works.

However, adding the same snippet in master fails. Our fix works, but then another part of the application fails with an access violation. This time in another area, because the code checks if an edge is null, and if so does something, but assumes that thing succeeds when it doesn't - basically the same type of bug that we just fixed! (At least this one is fairly straightforward to fix as well...)

#include <BinTools.hxx>
#include <TopoDS.hxx>
#include <TopoDS_Edge.hxx>
#include <ShapeUpgrade_UnifySameDomain.hxx>
{
  TopoDS_Shape s;
  BinTools::Read(s, "D:\\3drepo\\ISSUE_633\\shape.brep");
  ShapeUpgrade_UnifySameDomain usd(s);
  usd.Build();
}