IfcOpenShell / IfcOpenShell

Open source IFC library and geometry engine
GNU Lesser General Public License v3.0

`Default Building: Unexpected 'f' at offset 817790` error #5302

Closed Mannshoch closed 1 month ago

Mannshoch commented 2 months ago

Follow-up to https://github.com/FreeCAD/FreeCAD/issues/16063 (the problem is in IfcOpenShell, not in FreeCAD).

Unfortunately, I'm not allowed to provide the file.

I downloaded IfcConvert v0.7 and v0.8.

Output of IfcConvert.exe v0.7:

IfcOpenShell IfcConvert v0.7.0-f7c03db75 (OCC 7.5.3)
Scanning file...
Done scanning file
Parsing input file took 0 seconds
Creating geometry...
Done creating geometry (12 objects)

Log:
[Error] [2024-09-03 17:04:00] Unexpected 'f' at offset 817790

Conversion took 4 seconds

A working STP file is produced.

The same with IfcConvert.exe v0.8:

IfcOpenShell IfcConvert 0.8.0 (OCC 7.5.3)
Scanning file...
#10000

No file is created.

For the record, the file opens without problems in Bimviewer.

aothms commented 2 months ago

[Error] [2024-09-03 17:04:00] Unexpected 'f' at offset 817790

It would be interesting to know what's at character position 817790. Maybe you have Notepad++ (or another text editor) installed that allows you to easily jump to that position: https://superuser.com/a/1288569

If possible, paste the line that's there.

There have been a couple of reports that v0.8 is slightly less permissive in certain situations that were accepted in v0.7.

Mannshoch commented 1 month ago

I asked Copilot

#!/bin/bash

# Define the file and the offset
file="your_file.txt"
offset=817790

# Find the line, and the position within it, that contains the offset
awk -v offset="$offset" '{
    # Running character count; +1 accounts for the newline after each line
    count += length($0) + 1
    if (count >= offset) {
        line_number = NR
        char_position = offset - (count - length($0) - 1)
        print "Offset " offset " is in line " line_number " at position " char_position
        exit
    }
}' "$file"

Position 77 in this line:

#10565 = IFCBUILDINGELEMENTPROXY ( '0LhvikQPL3yAPWgZoKt7o$',#2,'Grundst\X2\00fc\X0\ck_Grundstück','','Grundst\X2\00fc\X0\ck_Grundstück',#10567,#10566,'6A899D20-9001-4FD3-9A-EB-5E88A7396E9E',.ELEMENT. ) ;
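The awk script counts characters per line plus one for the newline, so the result can drift on files with CRLF line endings or multi-byte characters. A minimal Python sketch of the same lookup on raw bytes follows. It assumes the offset reported by IfcConvert is a zero-based byte offset into the file, which is not confirmed in this thread; `line_at_offset` is a hypothetical helper name.

```python
def line_at_offset(data: bytes, offset: int):
    """Return (line_number, column, line_text) for a zero-based byte offset."""
    # Start of the line: one past the previous newline. rfind returns -1 when
    # there is no earlier newline, which conveniently gives a start of 0.
    start = data.rfind(b"\n", 0, offset) + 1
    end = data.find(b"\n", offset)
    if end == -1:
        end = len(data)
    line_number = data.count(b"\n", 0, start) + 1
    column = offset - start + 1  # one-based column within the line
    # latin-1 maps every byte to exactly one character, so decoding cannot fail
    text = data[start:end].decode("latin-1").rstrip("\r")
    return line_number, column, text


# Small in-memory demo with CRLF line endings
sample = b"#1=A();\r\n#2=B('x');\r\n"
print(line_at_offset(sample, 12))
```

For a real file, read it with `open(path, "rb").read()` and pass the offset from the log message.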

aothms commented 1 month ago

Thanks for that!

It's the lowercase fc. These need to be uppercase.

Also, it's really peculiar that the exporter escapes the first u umlaut, but leaves the 2nd unescaped which is of course also invalid.

Grundst\X2\00fc\X0\ck_Grundstück
       ^^^^^^^^^^^^          ^
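To make the two problems concrete, here is a minimal Python sketch that flags them in a string body and shows what a conforming encoding of the same text would look like. It is not IfcOpenShell's parser; `find_string_problems` and `encode_step_string` are hypothetical helper names, and the checks only cover `\X2\` runs and characters outside printable ASCII.

```python
import re

# A \X2\ ... \X0\ run; the payload must be groups of four uppercase hex digits
X2_RUN = re.compile(r"\\X2\\(.*?)\\X0\\")
VALID_PAYLOAD = re.compile(r"(?:[0-9A-F]{4})+")


def find_string_problems(text):
    """Return sorted (index, message) pairs for the two issues seen in this file."""
    problems = []
    for match in X2_RUN.finditer(text):
        if not VALID_PAYLOAD.fullmatch(match.group(1)):
            problems.append((match.start(), "invalid \\X2\\ payload: " + match.group(1)))
    for index, ch in enumerate(text):
        if not 32 <= ord(ch) <= 126:
            problems.append((index, "unescaped character: " + ch))
    return sorted(problems)


def encode_step_string(text):
    """Encode text using only printable ASCII plus \\X2\\ / \\X4\\ escapes."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch == "'":
            out.append("''")      # apostrophes are doubled
        elif ch == "\\":
            out.append("\\\\")    # backslashes are doubled
        elif 32 <= code <= 126:
            out.append(ch)
        elif code <= 0xFFFF:
            out.append("\\X2\\%04X\\X0\\" % code)
        else:
            out.append("\\X4\\%08X\\X0\\" % code)
    return "".join(out)


bad = r"Grundst\X2\00fc\X0\ck_Grundstück"
for index, message in find_string_problems(bad):
    print(index, message)
print(encode_step_string("Grundstück_Grundstück"))
```

The sketch escapes each character separately for simplicity; consecutive non-ASCII characters could also share a single `\X2\` run.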

So we'll fix this in IfcOpenShell to the extent that reading this model skips over these errors as before, and I'll leave the issue open in the meantime. But be aware that it's not top priority given the invalid nature of this file.

Please report these issues to your vendor, and let them know about the bSI validation service, which can be used free of charge: https://validate.buildingsmart.org/

aothms commented 1 month ago

Actually, thinking about this a bit longer: while the v0.8 behaviour is not very graceful, maybe it's better than the v0.7 behaviour. I think what's happening in v0.7 is that it just stops parsing after this syntax error and runs the conversion on the file up to that point, because we never implemented a way to recover parsing from string encoding errors. So while you might get a .stp file out of it, it would most likely be very incomplete. I think the correct behaviour is to terminate with a clear error message.
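A rough way to estimate how much of a file a parser would see if it stopped at the reported offset is to count how many instance definitions begin before that offset. The following Python sketch does that on raw bytes. It assumes every instance starts at the beginning of a line with `#id=`, which is common but not required by the format, and `instances_before_offset` is a hypothetical helper, not part of IfcOpenShell.

```python
import re

# An instance definition at the start of a line, e.g. "#10565 = IFC..."
INSTANCE_START = re.compile(rb"^[ \t]*#\d+[ \t]*=", re.MULTILINE)


def instances_before_offset(data: bytes, offset: int):
    """Return (instances starting before offset, total instances)."""
    starts = [match.start() for match in INSTANCE_START.finditer(data)]
    before = sum(1 for start in starts if start < offset)
    return before, len(starts)


sample = b"#1=A();\n#2=B('x');\n#3=C();\n#4=D();\n"
before, total = instances_before_offset(sample, 12)
print(f"{before} of {total} instances start before the error offset")
```

A low ratio suggests that output produced from the truncated parse is missing most of the model.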

Mannshoch commented 1 month ago

@aothms vendor is Solidworks. I have no connection to them, sorry.

Do I understand correctly that the problem here appears inside a text that is not important for importing the bodies? So I could edit this line and write something else? When errors occur inside such unimportant data, I propose to either replace the text with some kind of IFC text-error placeholder or simply remove all characters that are wrong, e.g. Grundstck_Grundstck

aothms commented 1 month ago

or simply remove all characters that are wrong

The problem is that this is a streaming text format, so once something like this happens you don't really 'know' anymore whether you're inside or outside of the string. Granted, in this place it's pretty clear for us humans, but generally it's harder to recover from syntax errors.
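The inside/outside problem can be illustrated with a toy scanner. The Python sketch below is not IfcOpenShell's parser and does not reproduce the exact failure in this file; it only shows how a single damaged character (here a lost apostrophe) shifts the string boundaries for everything that follows. `quoted_strings` is a hypothetical helper.

```python
def quoted_strings(text):
    """Return (list of quoted strings, True if a string is left unterminated)."""
    strings = []
    start = None  # index of the opening apostrophe while inside a string
    i = 0
    while i < len(text):
        if text[i] == "'":
            if start is None:
                start = i                      # entering a string
            elif text[i + 1:i + 2] == "'":
                i += 1                         # '' inside a string is an escaped apostrophe
            else:
                strings.append(text[start:i + 1])
                start = None                   # leaving the string
        i += 1
    return strings, start is not None


print(quoted_strings("#1=X('a','b',#2);"))  # well-formed line
print(quoted_strings("#1=X('a,'b',#2);"))   # one apostrophe lost
```

In the damaged line the scanner treats `'a,'` as the first string and then considers the rest of the line, and in a real file everything after it, to be inside an unterminated string.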

Mannshoch commented 1 month ago

I tested replacing the text; the ü also appears at a second position.

IfcConvert.exe D:\Haus-repaired.IFC D:\Haus-repaired.dae
IfcOpenShell IfcConvert 0.8.0 (OCC 7.5.3)
Scanning file...
Done scanning file
Parsing input file took 0 seconds
Creating geometry...
Done creating geometry (28 objects)

Log:
[Error] [2024-09-05 12:00:42] Inconsistent aggregate valuation while attempting to append class std::vector<int,class std::allocator<int> > to an aggregate of class std::vector<double,class std::allocator<double> >

Conversion took 5 seconds

(ifcconvert v0.7.0-f7c03db75 works here without error)

aothms commented 1 month ago

Yes, so this is a good thing. You can't mix tokens of different types in EXPRESS aggregates, and an integer `0` token and a real `0.` token are different things, however silly it may sound. But I've implemented some compatibility to cast to the most general type, while still printing the error.

ISO-10303-21;
HEADER;
FILE_DESCRIPTION(('ViewDefinition [CoordinationView]'),'2;1');
FILE_NAME('','2024-09-11T11:44:34',(),(),'IfcOpenShell v0.7.0-f7c03db75','IfcOpenShell v0.7.0-f7c03db75','');
FILE_SCHEMA(('IFC4'));
ENDSEC;
DATA;
#1=IFCCARTESIANPOINT((0.,0,0.));
ENDSEC;
END-ISO-10303-21;

>>> import ifcopenshell
>>> f = ifcopenshell.open('inconsistent.ifc')
>>> f[1]
#1=IfcCartesianPoint((0.,0.,0.))
>>> print(ifcopenshell.get_log())
[Error] [2024-09-11 14:22:25] Inconsistent aggregate valuation while attempting to append int to an aggregate of double
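
The "cast to the most general type" idea can be sketched in plain Python. This is only an illustration of the behaviour shown above, not the actual IfcOpenShell implementation; `parse_number_token` and `coerce_aggregate` are hypothetical names, and the message text is modelled loosely on the log line.

```python
def parse_number_token(token):
    """A token with a decimal point is a real, one without is an integer."""
    return float(token) if "." in token else int(token)


def coerce_aggregate(tokens):
    """Parse numeric tokens; if ints and reals are mixed, widen all to float.

    Returns (values, messages) so the caller can still report the problem.
    """
    values = [parse_number_token(token) for token in tokens]
    messages = []
    if {type(value) for value in values} == {int, float}:
        messages.append(
            "Inconsistent aggregate valuation while attempting to "
            "append int to an aggregate of double"
        )
        values = [float(value) for value in values]  # most general type wins
    return values, messages


values, messages = coerce_aggregate(["0.", "0", "0."])
print(values)
print(messages)
```

Returning the messages alongside the values mirrors the approach described in the comment: the data stays usable, but the inconsistency is still reported.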