xoreos / xoreos-tools

Tools to help the development of xoreos
https://xoreos.org/
GNU General Public License v3.0

FEATURE: Compile xoreos-tools as a DLL/library #58

Open lachjames opened 4 years ago

lachjames commented 4 years ago

Hi :)

I've been using xoreos-tools for several of my projects, and up until now I've been mainly doing so through subprocess calls and writing files to disk. For example, I use it in my Python-based dialog editor (which I admittedly could write in C++ but that would be a significant changeover as I use several Python libraries for the project) as well as a Unity-based KOTOR level editor (which is in C# and can't be ported to C++ without changing game engines). This subprocess-based solution is less than ideal, and on machines without an SSD it could even become a significant bottleneck (although on my relatively fast SSD-based machine it's been fine so far).

Rather than moving to C++ (which is difficult and even not really possible in some cases), it seems that having "xoreos-tools come to us" through a DLL (which I could then write bindings for in all relevant languages) would be a better solution.

I took a look through the codebase and it seems that a lot of the code for the executables is also written in a way conducive to compiling as a DLL. Minor refactoring would be required for some files which include necessary logic in the main function (for example, eff and keybif would need a slight refactor). I think the ideal starting point would be the ability to do anything you can do through running the .exe files through a call to the DLL instead. I'm sure that there are other interesting/useful things that your code does behind the scenes which might be useful for other projects too though, so this is also worth considering down the track.

Is making xoreos-tools available as a DLL something that you'd consider?

Thanks :)

Lachjames

DrMcCoy commented 4 years ago

Hmm, well, making this a library is on my TODO list, but on a wider scale. Right now, xoreos, xoreos-tools and Phaethon contain a lot of duplicated, copied code, and it would be great to unify that into a library.

However, there's a few caltrops:

Essentially, I have this on my (non-dated) roadmap, but I don't think the time for that is now.

That said, I have a different suggestion for you to work around the disk I/O problem you describe. You don't need to write text-based output from the xoreos-tools to disk; you can just let them write to stdout and capture the stream from your Python code. This should also work with binary output where the outputting classes don't need to seek, but not all the tools can write binary output to stdout yet, and I'm unsure about the portability there.
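For readers following along, here is a minimal sketch of capturing a tool's stdout from Python without an output file. The helper name `run_and_capture` is made up for illustration, and the demo uses the Python interpreter as a stand-in command. With xoreos-tools you would pass something like a `gff2xml` command line instead, assuming it is invoked with arguments that make it write to stdout.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run a command and return everything it wrote to stdout, as bytes.

    `cmd` is a list such as ["gff2xml", "input.dlg"] (an assumed example;
    check the tool's own usage text for the real arguments).
    """
    # check=True raises CalledProcessError if the tool exits with an error.
    result = subprocess.run(cmd, stdout=subprocess.PIPE, check=True)
    return result.stdout


if __name__ == "__main__":
    # Stand-in command so the sketch runs without xoreos-tools installed.
    demo_cmd = [sys.executable, "-c", "print('<xml/>')"]
    print(run_and_capture(demo_cmd).decode().strip())
```

The output stays in memory as `bytes`, so text output can be decoded and binary output can be used as-is, subject to the seeking caveat mentioned above.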

lachjames commented 4 years ago

Thanks for your detailed response! Your argument makes sense and I agree with it.

I am currently using the exact method you describe to capture stdout and save writing a second file to disk, but I still need to write the input file. Perhaps a good compromise would be to allow stdin to be used as input if no file is given? Then at least the drive bottlenecks would be removed as much as they can be (with the exception of reading the .exe from disk in the first place, I suppose, but that's not really avoidable).

This would make it possible to write relatively contained libraries in different languages hooking into xoreos-tools without the use of temporary files, which would be fantastic (although not as performant as a DLL, it would likely be close enough for most use cases).
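A rough sketch of what the caller side could look like if a tool accepted its input on stdin. This is hypothetical: stdin input is only a proposal at this point in the thread, so the demo pipes data through a stand-in Python one-liner rather than a real xoreos-tools binary. The helper name `convert_in_memory` is invented for illustration.

```python
import subprocess
import sys


def convert_in_memory(cmd, data):
    """Send `data` (bytes) to the command's stdin and return its stdout (bytes).

    No temporary input or output file is created by the caller.
    """
    result = subprocess.run(cmd, input=data, stdout=subprocess.PIPE, check=True)
    return result.stdout


if __name__ == "__main__":
    # Stand-in filter that upper-cases whatever arrives on stdin.
    # A real tool would go here once it supports reading from stdin (assumption).
    upper_filter = [
        sys.executable,
        "-c",
        "import sys; sys.stdout.write(sys.stdin.read().upper())",
    ]
    print(convert_in_memory(upper_filter, b"hello").decode())
```

A thin wrapper like this is all a per-language binding would need, which is why the approach gets close to the DLL use case without any linking.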

DrMcCoy commented 4 years ago

(Moved the stdin thing into issue #59)

lachjames commented 4 years ago

Spitballing here: combined with #59, the ability to run the executables in some sort of "batch/input" mode where we keep the tool active and continually pass data into it via stdin (with some sort of termination marker signifying the end of a program) would mostly mitigate any need for a DLL (in my use cases, at least), and possibly be easier to work with as well. The main bottlenecks are running the .exe to begin with, and writing/reading from disk. Removing both of those would make integrating the tool with other code via a subprocess much more performant.

DrMcCoy commented 4 years ago

> we keep the tool active and continually pass data into it via stdin (with some sort of termination marker signifying the end of a program)

Ooof, no, sorry, but that's not really how to do things. Chaining and separating inputs is not the job of a tool, that's something the caller (the shell, in most cases) does by invoking the tool multiple times.

Really, just reading from stdin and writing to stdout gets you there. "Keeping the tool active" would only remove the overhead of reading the program itself from disk again (and that's basically always cached by the OS anyway) and of executing the program (insignificant).

lachjames commented 4 years ago

Sure, I understand, and I readily confess that I'm not very knowledgeable about how these things should work. My understanding is that the overhead of running a subprocess is lower on Linux than on Windows, so perhaps the problem is worse for me than it would be for you while testing. I'm honestly not sure whether Windows/Linux/etc. caches subprocesses run multiple times in a short period of time (or whether this is implementation-dependent somehow).
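One way to settle the overhead question on a given machine is to measure it rather than guess. The sketch below times repeated launches of a do-nothing process. The helper name `mean_spawn_seconds` is made up, and the Python interpreter stands in for a real tool, so the numbers only approximate what an xoreos-tools binary would cost to start.

```python
import subprocess
import sys
import time


def mean_spawn_seconds(cmd, runs=5):
    """Average wall-clock time to start `cmd` and wait for it to exit."""
    start = time.perf_counter()
    for _ in range(runs):
        # Discard output so only launch and exit time is measured.
        subprocess.run(cmd, stdout=subprocess.DEVNULL, check=True)
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    # Stand-in command: an interpreter that exits immediately.
    noop = [sys.executable, "-c", "pass"]
    print(f"{mean_spawn_seconds(noop) * 1000:.1f} ms per launch")
```

Running it on both Windows and Linux, and comparing the first launch against later ones, would show how much of the cost is process creation and how much is reading the program from disk.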