Open fkromer opened 3 years ago
That autowrap library looks quite nice. We don't really have an internal need for this but we're certainly open to a PR that introduces the capability. And if someone adds it I will maintain it as the pwiz API changes. I would help with the Boost.Build aspect and integrate it into our CI testing.
I think there's already some existing SWIG wrapper for one of the simplest, C-style interfaces to pwiz (pwizRampAdapter) but it doesn't get maintained and I'm not sure anybody's using it. I don't maintain it because the pwizRampAdapter is too simple and doesn't model mzML properly.
That autowrap library looks quite nice. We don't really have an internal need for this but we're certainly open to a PR that introduces the capability. And if someone adds it I will maintain it as the pwiz API changes. I would help with the Boost.Build aspect and integrate it into our CI testing.
Thanks a lot for your support offer. I'll clarify task priorities internally and let you know if and how much effort I can put into this.
I think there's already some existing SWIG wrapper for one of the simplest, C-style interfaces to pwiz (pwizRampAdapter) but it doesn't get maintained and I'm not sure anybody's using it. I don't maintain it because the pwizRampAdapter is too simple and doesn't model mzML properly.
It would be great if you could give me a concrete reference that I can have a look at.
By concrete reference do you mean the SWIG bindings I referred to, or the C++ classes I'd like bindings for?
I suggest you have a look at autowrap; you can talk to me or @uweschmitt about what it actually does. It is basically a way to auto-generate Cython code. Since we developed autowrap there have also been some developments like pybind11 that look pretty interesting.
There's also Boost.Python, but it's been a long time since I looked at Python bindings. My main concern with bindings is low maintenance and a preference that the OOP style of the pwiz MSData library not be reduced to lowest-common-denominator C-style structs and procedures. My brief glance at autowrap seemed like it would handle that (including boost::shared_ptr which we use very often in MSData). Do you want to glance at our MSData.hpp header and tell me whether autowrap would be a good fit? It's basically just a C++ representation of the mzML data model.
Of course, it would also be nice to have vendor reader support. I'm not sure how that would work with Cython.
I think there would be strong interest in this; the R bindings enjoy quite a bit of popularity. Boost.Python is doable but it's a lot of manual work. I have done that once and I cannot recommend it. Basically you have to translate every single data structure manually in C++ code, which is not fun. The way autowrap works is that it basically generates a Python object with a single member, a Boost shared_ptr which points to the C++ object. This solves problems with memory management since that is done in C++, and once no more Python objects point to the C++ object it can safely get deleted.
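To make the ownership idea above concrete, here is a toy model in plain Python. It is not autowrap output and does not use Cython; `CppObject`, `SharedPtr` and `PyWrapper` are hypothetical names invented for this sketch, and the explicit `close()` call stands in for whatever happens when Python garbage-collects a real wrapper.

```python
class CppObject:
    """Stand-in for a C++ object; 'alive' records whether it was deleted."""

    def __init__(self, name):
        self.name = name
        self.alive = True


class SharedPtr:
    """Toy reference-counted handle, loosely mimicking boost::shared_ptr."""

    def __init__(self, target, _count=None):
        self._target = target
        # All copies share one mutable counter.
        self._count = _count if _count is not None else [0]
        self._count[0] += 1
        self._released = False

    def copy(self):
        return SharedPtr(self._target, self._count)

    def use_count(self):
        return self._count[0]

    def get(self):
        return self._target

    def release(self):
        if self._released:
            return
        self._released = True
        self._count[0] -= 1
        if self._count[0] == 0:
            # Last owner is gone: "delete" the C++ object.
            self._target.alive = False


class PyWrapper:
    """Python-side object whose single member is the shared pointer."""

    def __init__(self, ptr):
        self._ptr = ptr

    def name(self):
        return self._ptr.get().name

    def close(self):
        self._ptr.release()


if __name__ == "__main__":
    obj = CppObject("spectrum")
    first = PyWrapper(SharedPtr(obj))
    second = PyWrapper(first._ptr.copy())
    first.close()
    print(obj.alive)   # still referenced by 'second'
    second.close()
    print(obj.alive)   # no owners left
```

The point of the sketch is only that the wrapper itself holds no data: lifetime decisions live in the shared handle, so several Python objects can refer to the same underlying object without one of them deleting it from under the others.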
Of course, it would also be nice to have vendor reader support. I'm not sure how that would work with Cython.
I think as long as the API calls in pwiz are there and the dlls are distributed it should actually work.
@chambm
By concrete reference do you mean the SWIG bindings I referred to, or the C++ classes I'd like bindings for?
Both would be great.
My main concern with bindings is low maintenance and a preference that the OOP style of the pwiz MSData library not be reduced to lowest-common-denominator C-style structs and procedures.
I fully agree.
@hroest
Of course, it would also be nice to have vendor reader support. I'm not sure how that would work with Cython.
I think as long as the API calls in pwiz are there and the dlls are distributed it should actually work.
Definitely. I'm used to working with Linux runtime environments, not Windows ones. However, I could imagine that it's not straightforward to support vendor software components in a Python wrapper.
As far as I know it's impossible to run Windows DLLs in Linux environments without some kind of abstraction layer. That holds regardless of whether the DLLs depend on runtime components installed on the OS, e.g. C# or C++ runtime frameworks. The Python wrapper would not work on Linux runtime environments of course. Do different C#/C++ runtime framework versions installed on the same Windows system potentially conflict? In this case one would have to state the runtime framework dependencies explicitly and carefully. Otherwise you can run into major issues if you want to run some other software on your Windows runtime environment that requires different C#/C++ framework versions.
If the vendor components were available as plain C++ or plain C#, one could probably add support for Linux environments (using the Mono project in the case of C#).
For the C++ class, see the MSData.hpp file I linked to. Here are the SWIG bindings: https://github.com/ProteoWizard/pwiz/tree/master/pwiz/utility/bindings/SWIG
As long as the vendor support works with Python on Windows, it should probably work in the pwiz docker container via wine.
For the C++ class, see the MSData.hpp file I linked to. Here are the SWIG bindings: https://github.com/ProteoWizard/pwiz/tree/master/pwiz/utility/bindings/SWIG
Thanks for the hints.
As long as the vendor support works with Python on Windows, it should probably work in the pwiz docker container via wine.
Yeah, right.
Do different C#/C++ runtime framework versions installed on the same Windows system potentially conflict? In this case one would have to state the runtime framework dependencies explicitly and carefully. Otherwise you can run into major issues if you want to run some other software on your Windows runtime environment that requires different C#/C++ framework versions.
On Windows, the runtime version (ergo the version of the platform libraries you get) is usually tied to the MSVC compiler version you use. This is why Windows users had to install the appropriate MSVC redistributable C++ runtimes (the "CRT") in order to get C/C++ binaries to work. Around Windows 8 or 10, they transitioned to what they called "universal CRT" or UCRT which is an operating system feature, installed with the standard OS update mechanism. The .NET runtime is also selected by the compiler and on Windows installed as an OS feature or update. So for most new machines, this isn't an issue anymore unless you're using a really old program with extra features (like Py2.7+OpenMP). Naturally, running something built with the UCRT on a machine predating it means you need to ship the UCRT libraries with your executable.
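As a rough illustration of the compiler/runtime coupling described above, the following sketch inspects the compiler string that Python reports for itself. `platform.python_compiler()` is a standard-library call; the helper names `msvc_version` and `built_against_ucrt` are hypothetical, and the 1900 threshold rests on the assumption that MSVC builds from Visual Studio 2015 onward (`MSC v.1900` and later) target the universal CRT.

```python
import platform
import re


def msvc_version(compiler):
    """Extract the MSVC version number from a compiler string, or None."""
    match = re.search(r"MSC v\.(\d+)", compiler)
    return int(match.group(1)) if match else None


def built_against_ucrt(compiler):
    """Guess whether a binary built with this compiler targets the UCRT.

    Assumption: MSC v.1900 (Visual Studio 2015) and later use the UCRT.
    """
    version = msvc_version(compiler)
    return version is not None and version >= 1900


if __name__ == "__main__":
    compiler = platform.python_compiler()
    print(compiler)
    print("MSVC version:", msvc_version(compiler))
    print("Likely UCRT build:", built_against_ucrt(compiler))
```

On a non-Windows interpreter the compiler string names GCC or Clang instead, so `msvc_version` returns `None` there.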
If you're going to have Python link with a Windows executable running under Wine on a *nix, you'll need to be using a Windows-compiled Python too, correct?
On Windows, the runtime version (ergo the version of the platform libraries you get) is usually tied to the MSVC compiler version you use. This is why Windows users had to install the appropriate MSVC redistributable C++ runtimes (the "CRT") in order to get C/C++ binaries to work. Around Windows 8 or 10, they transitioned to what they called "universal CRT" or UCRT which is an operating system feature, installed with the standard OS update mechanism. The .NET runtime is also selected by the compiler and on Windows installed as an OS feature or update. So for most new machines, this isn't an issue anymore unless you're using a really old program with extra features (like Py2.7+OpenMP). Naturally, running something built with the UCRT on a machine predating it means you need to ship the UCRT libraries with your executable.
Interesting.
If you're going to have Python link with a Windows executable running under Wine on a *nix, you'll need to be using a Windows-compiled Python too, correct?
Reading about Wine has not brought me any further so far.
You mostly don't need to read or worry about Wine. Just make it work on Windows, and Wine will allow it to work on Linux by magic.
I'm looking for such a wrapper library as well.
A hack I've been using (a bit ugly but working fine) is to spawn the pwiz/msconvert docker image from Python (requires docker though).
import logging

import docker

logger = logging.getLogger(__name__)

# LOCAL_FOLDER and LOCAL_OUTPUT are host paths you define yourself.
running_container = docker.DockerClient().containers.run(
    "chambm/pwiz-skyline-i-agree-to-the-vendor-licenses",
    "wine msconvert YOUR_FILE_OR_FOLDER_FROM_DOCKER_MOUNT --filter --arguments --things",
    volumes={
        f"{LOCAL_FOLDER}": {"bind": "/data", "mode": "ro"},
        f"{LOCAL_OUTPUT}": {"bind": "/out_data", "mode": "rw"},
    },
    stdout=True, stderr=True, stream=True, detach=True, auto_remove=True,
    remove=True,  # To unmount the volume for the next run.
)
for log in running_container.logs(stream=True, stdout=True, stderr=True, timestamps=True):
    logger.debug(log)  # or print
Hope it helps someone.
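For anyone adapting the snippet above, one possible way to keep the host-to-container path mapping in one place is a small helper that only builds the arguments and does not talk to Docker itself. The helper name `build_msconvert_run` and its parameters are hypothetical; the `/data` and `/out_data` mount points are taken from the snippet, and passing `-o /out_data` assumes you want msconvert to write into the writable mount.

```python
from pathlib import PurePosixPath

IMAGE = "chambm/pwiz-skyline-i-agree-to-the-vendor-licenses"


def build_msconvert_run(local_folder, local_output, file_name, extra_args=()):
    """Return (image, command, volumes) suitable for containers.run().

    local_folder -- host directory holding the input, mounted read-only
    local_output -- host directory for results, mounted read-write
    file_name    -- input file name relative to local_folder
    extra_args   -- further msconvert arguments, passed through unchanged
    """
    input_path = PurePosixPath("/data") / file_name
    command = ["wine", "msconvert", str(input_path), "-o", "/out_data", *extra_args]
    volumes = {
        str(local_folder): {"bind": "/data", "mode": "ro"},
        str(local_output): {"bind": "/out_data", "mode": "rw"},
    }
    return IMAGE, command, volumes


if __name__ == "__main__":
    image, command, volumes = build_msconvert_run(
        "/home/me/raw", "/home/me/converted", "sample.raw"
    )
    print(image)
    print(command)
    print(volumes)
```

The returned values can then be passed as `containers.run(image, command, volumes=volumes, ...)` with the same keyword arguments as in the snippet above.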
I wondered if you guys would be interested in wrapping the pwiz functionality in a Python wrapper. Another chem lib, OpenMS, uses a utility lib called autowrap to make this process as painless as possible, to provide pyopenms and to keep it in sync with OpenMS. However, I'm not sure if the pwiz implementation satisfies the requirements of autowrap to work properly. In any case it would be very valuable to be able to use pwiz functionality in the major data science language Python.