POETSII / Orchestrator

The Orchestrator is the configuration and run-time management system for POETS platforms.

Support .gz encoded xml files (low priority) #210

Closed m8pple closed 2 years ago

m8pple commented 3 years ago

Most large pre-generated xml is gz encoded to save space and reduce loading time. For example, the 2019 benchmarks/compliance tests sent to ADB for stress-testing are all gz encoded. This typically reduces them to about 1/5 to 1/10 of their original size, and usually makes them faster to both generate and load, beyond just being smaller on disk. A lot of the existing tools are able to directly load .xml.gz files without a decompression step, which makes scaling experiments much easier.

Decompression comes for free if you use an existing XML parser, but it is probably not that difficult (maybe) to add to a custom XML parser. It looks like xmlP::Parse takes a FILE *, so one approach would be to detect that the input has the extension .xml.gz and then use popen to stream in the data via gunzip.
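A rough sketch of that popen approach, for illustration only. The names OpenXmlInput, CloseXmlInput, XmlInput, ShellQuote and EndsWith are hypothetical and not part of the Orchestrator; it assumes a POSIX system with gunzip on the PATH, and it assumes the parser only reads the FILE * sequentially (a pipe cannot be seeked or rewound).

```cpp
#include <stdio.h>   // fopen, fclose, plus POSIX popen/pclose
#include <string>

// Hypothetical helper: wrap a path in single quotes so /bin/sh treats it
// as one literal argument (embedded single quotes are escaped).
static std::string ShellQuote(const std::string& path)
{
    std::string quoted = "'";
    for (char c : path)
    {
        if (c == '\'') quoted += "'\\''";
        else quoted += c;
    }
    quoted += "'";
    return quoted;
}

// Hypothetical helper: true if s ends with suffix.
static bool EndsWith(const std::string& s, const std::string& suffix)
{
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Remembers how the stream was opened, because a popen stream must be
// closed with pclose and an fopen stream with fclose.
struct XmlInput
{
    FILE* stream;  // nullptr if opening failed
    bool isPipe;   // true if stream came from popen
};

// Open a plain file directly, or a .xml.gz file through "gunzip -c".
XmlInput OpenXmlInput(const std::string& path)
{
    XmlInput in = {nullptr, false};
    if (EndsWith(path, ".xml.gz"))
    {
        std::string cmd = "gunzip -c -- " + ShellQuote(path);
        in.stream = popen(cmd.c_str(), "r");
        in.isPipe = true;
    }
    else
    {
        in.stream = fopen(path.c_str(), "r");
    }
    return in;
}

// Close the stream. For a pipe, the return value is the wait status of
// gunzip, so non-zero indicates that decompression failed.
int CloseXmlInput(XmlInput& in)
{
    if (in.stream == nullptr) return -1;
    int rc = in.isPipe ? pclose(in.stream) : fclose(in.stream);
    in.stream = nullptr;
    return rc;
}
```

The stream from OpenXmlInput would then be handed to whatever currently receives the FILE *. Note that popen usually succeeds even if the file is missing or corrupt, since it only starts the shell; the failure shows up as a short read and a non-zero result from CloseXmlInput, so that return value is worth checking after parsing.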

Not the highest priority feature, but it adds to ergonomics quite a lot while dealing with large applications.

mvousden commented 3 years ago

Agree, with slight reservations about portability (I ain't setting that up on Windows!).

m8pple commented 3 years ago

Does anyone care about native windows viability? I use windows exclusively as my host OS, and have no problems maintaining a posix/linux environment. I realise there might be... outliers, but they could be educated.

(Not suggesting this is high priority (it isn't)).

mvousden commented 3 years ago

ADB (and possibly @heliosfa?). You're welcome to do the educating if you like :)

Generally speaking:

heliosfa commented 3 years ago

"life's too short to be educated".

Suggestion is to throw out a system call (to 7zip/winzip/whatever on Windows), though it needs a little more thought.

m8pple commented 2 years ago

Closing this, as it is possible to work around and is causing noise in the issues. While it is trivial to do in POSIX, it is hard on Windows.