CadQuery / cadquery

A python parametric CAD scripting framework based on OCCT
https://cadquery.readthedocs.io
Other
3.15k stars 289 forks source link

Graceful exit on broke step file #1521

Open qbamca opened 7 months ago

qbamca commented 7 months ago

We run Flask app that converts step files to stl for further processing. When we were testing the app we noticed that when supplied with manually broken step file, the cadquery library crash gracefully our Flask app. No Error is raised. We tried catching SystemExit with no avail. The app just stops. I've traced cadquery inner works up to this point: image at which the reader.TransferRoot(i + 1) closes our app. I don't know how to trace it further inside OCP, so I can't investigate further.

The problem is that, such files can close our service, we have no means to prevent it and results in service downtimes while it's being restarted. We need to find a way to prevent it from closing the app.

To Reproduce

repro repo It's a flask app. Run main.py and post file BROKE_FILE.STEP

Backtrace

no backtrace

image

Environment

OS: Ubuntu 22.04.2 LTS python 3.10.12 cadquery v2.3.1 cadquery is installed using pip.

Using: Python interpreter

adam-urbanczyk commented 7 months ago

I tried loading your step directly via CQ and nothing crashes, but I do get an infinite loop it seems. You'll have to raise an issue with OCCT I'm afraid. Is the STEP actually malicious?

lorenzncode commented 7 months ago

Strange, OCCT Draw shows "Could not read file BROKEN_FILE.step , abandon" this corresponds to readstat != IFSelect_RetDone. I also tried with C++ and again ReadFile returns IFSelect_RetError (tested on OCCT 7.7.2, 7.8.0).

Edit: Ignore this comment. My mistake in testing (typo). Actually I do get the same IFSelect_RetDone status.

adam-urbanczyk commented 7 months ago

Interesting, on win I get an infinite loop with 7.7.1 and a silent crash with 7.7.2.

lorenzncode commented 7 months ago

I can reproduce the crash on linux with CQ and 7.7.2.

OCCT Draw does not crash; it does report ERR messages in the output (7.7.2):

Reduced model for translation without additional info will be used 
**** ERR StepFile : Incorrect Syntax : Fails Count : 1 ****
*** ERR StepReaderData : Unresolved Reference : Fails Count : 15 ***

image

The same ERR messages are printed with CQ with:

from OCP.Message import Message, Message_Gravity
for printer in Message.DefaultMessenger_s().Printers():
    printer.SetTraceLevel(Message_Gravity.Message_Info)
qbamca commented 7 months ago

I wouldn't call STEP file malicious. We did remove a line or two by hand to simulate the broken file, though. The errors @lorenzncode mentioned, we've seen them already with files that were uploaded by users but it was in different environment where silent crashes were not a problem and we didn't notice the problem then. My point is, we suspect we might have to deal with such files in production and files broken in any way, should not result in silent crash.

adam-urbanczyk commented 7 months ago

@lorenzncode interesting. I checked with draw from 7.5 I do get an infinite loop of sorts. Draw from 7.8 works fine (I don't have 7.7.2 at hand). Looking quickly at the code of OCCT I was not able to tell how to extract the errors programmatically. Would you know?

lorenzncode commented 7 months ago

@adam-urbanczyk I found the messages can be extracted using Message_PrinterToReport.

import io
from OCP.Message import Message, Message_Gravity, Message_PrinterToReport
from OCP.STEPControl import STEPControl_Reader

msger = Message.DefaultMessenger_s()
newprinter = Message_PrinterToReport()
report = newprinter.Report()
msger.AddPrinter(newprinter)

reader = STEPControl_Reader()
reader.ReadFile("BROKEN_FILE.STEP")

f = io.BytesIO()
alerts = report.GetAlerts(Message_Gravity.Message_Info)
for alert in alerts:
    alert.Attribute().DumpJson(f)

print(f.getvalue())
f.close()
lorenzncode commented 7 months ago

I get the same stacktrace in C++ and Draw with

OSD::SetSignal(OSD_SignalMode_Unset, Standard_False);

No crash with OSD::SetSignal();

In Draw, default is set here: https://github.com/Open-Cascade-SAS/OCCT/blob/cec1ecd0c9f3b3d2572c47035d11949e8dfa85e2/src/Draw/Draw_BasicCommands.cxx#L1004

lorenzncode commented 7 months ago

I opened an OCCT issue https://tracker.dev.opencascade.org/view.php?id=33603

@qbamca Can you also share the unmodified STEP file? Probably not useful but could see the diff.

adam-urbanczyk commented 7 months ago

@lorenzncode what is the current conclusion regarding CQ? I.e. can we change some settings to mitigate the issue?

lorenzncode commented 7 months ago

@adam-urbanczyk Pending a fix in OCCT itself, yes it would be possible to mitigate the issue in CQ by conditionally aborting importStep based on monitoring the Message_Info level alerts.

The simplest way might be to provide an option to abort on any alert (ignoring the contents of the messages). If we find it results in false positives then it might be required to check for specific messages.

In C++ the issue can be mitigated with OSD::SetSignal(). I don't know if that is possible in OCP?

dpasukhi commented 7 months ago

Thank you for the report.

dpasukhi commented 7 months ago

This kind of issue is not stable. The result is undefined if you will disable catching signals. And there is no specific message information for that kind of access problem. If you have any kind of crash, we fix that fast. But for a long time, it happened rarely. And only on non-correct STEP files.

lorenzncode commented 7 months ago

@dpasukhi Thank you for your inputs!

@adam-urbanczyk Since crashes are expected to be rare perhaps CQ can separate the STEPControl_Reader().ReadFile() part of importStep and let the user implement a safer mode if they need it. This would provide the user the opportunity to monitor messages and decide to proceed with reading the step file. I'll share an example.

qbamca commented 7 months ago

@qbamca Can you also share the unmodified STEP file? Probably not useful but could see the diff.

I've added the original file in the repo. 3007091.STEP We also confirmed that we see same results with files uploaded by users, without any modification on our part so the problem is not artificial.

BTW, thank you all for looking into this.

qbamca commented 7 months ago

can you give me a short status report of this issue? i'm not sure how to proceed.

adam-urbanczyk commented 7 months ago

Same here. Maybe take a look if #1525 helps you. It seems that there is no one liner that would solve this. You might need to handle such issues on your side using OCCT directly.

dpasukhi commented 7 months ago

The ticket is one of the first to fix. I wait for compleating another ticket for the developer. Crash is first priority for maintance activities. I hope we will prepare fix this week or beginning of the next week. Patch will be based on master branch. You will need to cherry pick or rebase after resolving issue.

qbamca commented 3 months ago

hi, it's been a while. Is there any progress with this issue?

dpasukhi commented 3 months ago

From OCCT side the issue was resolved a long time ago. (If you found new issues, please let me know)

adam-urbanczyk commented 3 months ago

Probably when this lands in the OCCT release and when this release is wrapped and used in CQ. I assume 7.9, will take a while.

dpasukhi commented 3 months ago

Changes included into 7.8.1. 7.8.0 and 7.8.1 are easy to replace just by replacing dll or so files. No conflicts.

adam-urbanczyk commented 3 months ago

Then stay tuned (#1589)

dpasukhi commented 3 months ago

7.7.2 -> 7.8.n is a little more complicated task. There was a reorganization of some things, see https://dev.opencascade.org/doc/overview/html/occt__upgrade.html#upgrade_occt780 But great to know :)

qbamca commented 3 months ago

thank you guys!