tpaviot / pythonocc-core

Python package for 3D geometry CAD/BIM/CAM
GNU Lesser General Public License v3.0
1.39k stars 379 forks source link

Allow reading/writing step files to string #1299

Closed anuraaga closed 9 months ago

anuraaga commented 9 months ago

In #1096 there is great interest in being able to marshal step files without temporary files on disk. While *Stream methods are provided, they are awkward to use with Python files, and I think in some places in the library this is acknowledged and methods that read and write strings are provided. Strings do require buffering so may not be ideal compared to a stream in all cases, but in many cases it actually already is buffered, and it is still a significant improvement over temporary files for many of the cases listed in the issue, it would be great to be able to use this without files.

I mostly copied the pattern of breptools

https://github.com/tpaviot/pythonocc-core/blob/master/src/SWIG_files/wrapper/BRepTools.i#L855

If there's a reason to avoid this, no worries in closing this as it's out of the blue. But since the change wasn't big I figured I'd give it a shot.

tpaviot commented 9 months ago

Thank you for this interesting contribution.

However, the thing is you still need to have the step file stored on your local filesystem, and I can't see the improvement compared to the standard Read method. It would be much better that the ReadFromString method only takes a string, let's say:

with open("test.stp", "r", encoding="uft8") as step_file:
    step_file_content = step_file.read()
step_reader.ReadFromString(step_file_content)

This way, the ReadFromString method could take any string, coming from instance from an API call.

anuraaga commented 9 months ago

Hi @tpaviot - I included the filename in ReadString to be consistent with ReadStream. I'm actually not sure what the filename is used for, I assumed it's something to help the STEP parser understand. But in neither case is the file actually present on the filesystem, which we confirm in the unit test.

Note ReadStream upstream doc is here

https://dev.opencascade.org/doc/refman/html/class_x_s_control___reader.html#a49a080413455e31484199c4bdf25e08e

It doesn't quite say what the purpose of the parameter is but seems to be required. One idea is to implement ReadString with a hard-coded filename instead of taking a parameter, fromstring.step or something but wasn't sure the implications of it. We would need some sort of file name since in the end we need to call ReadStream.

tpaviot commented 9 months ago

Good point ! If the filename is not necessary, let's just remove it (pass an empty string as filename), we don't need it at the python level (I have to check however why the filename is a mandatory parameter).

FYI I dont modify swig files directly in the wrapper directory, I use to work with the pythonocc-generator:

tpaviot commented 9 months ago

Using an empty string seems to work, but there maybe some side effects I can not see so far.

tpaviot commented 9 months ago

I modified the example https://github.com/tpaviot/pythonocc-demos/blob/master/examples/core_load_step_ap203_ocaf.py so that it can taka a string, that seems to work the same way as using a file.

Maybe there would be a more explicit solution than just passing an empty string as the stream name. Or leave this parameter with a default value, for instance:

def ReadFromString(the_string, the_stream_name="pythonocc_stream_name"):

@anuraaga your opinion?

tpaviot commented 9 months ago

Reverse engineering of the ReadStream method of STEPCAFControl_Reader:

It simply calls the ReadStream method of the myReader instance with type STEPControl_Reader

The stream name parameter is used at line https://github.com/Open-Cascade-SAS/OCCT/blob/master/src/STEPControl/STEPControl_Reader.cxx#L201 :

WS()->SetLoadedFile(theName);

The WS instance is a XSControl_WorkSession

The stream is read by calling the method ReadStream of the aLibrary instance (line 187) with type IFSelect_WorkLibrary defined here: https://github.com/Open-Cascade-SAS/OCCT/blob/master/src/IFSelect/IFSelect_WorkLibrary.cxx#L91

It just returns 1. Damned, dead end.

 void SetLoadedFile (const Standard_CString theFileName)
  { theloaded = theFileName; }

  //! Returns the filename used to load current model
  //! empty if unknown
  Standard_CString LoadedFile() const
  { return theloaded.ToCString(); }

Conclusion I guess we can safely use an hardcoded string for the stream name parameter. We can wrap it as an optional parameter in python, if ever someone needs to set it to a different value.

anuraaga commented 9 months ago

Thanks for confirming! In that case I lean towards not exposing a parameter, and if for some reason someone asks for it it could be added with a default value to be backwards compatible (I think). I somehow expect for no one to ever ask for it.

pythonocc-generator

Sorry I hadn't noticed this repo. Should I move this change to it? And also

tpaviot commented 9 months ago

I've been working on a more robust solution : wrap all std::istream and std::ostream parameters in occt to python strings. This leads to a simpler generator, and a python wrapper more aligned with occt method names and signature. It maybe less pythonic, but easier to go from the official documentation to python. PR is on the way

tpaviot commented 9 months ago

@anuraaga see #1301

anuraaga commented 9 months ago

Thanks! That looks very useful

hunosdof commented 2 weeks ago

I am trying to load a step file from a stream instead directly from a file.

However, the following function fails:

status = step_reader.ReadStream(file_path, file_content)

I am passing this file_content: if isinstance(file_content, bytes): file_content = file_content.decode('utf-8')

It is in utf-8 format. However, I do not get an error but my code fails in this line.

I assume that the file_content argument is incorrect (but str is expected). Is there a specific format this string has to be in?

Thank you!