hyperspy / rosettasciio

Python library for reading and writing scientific data formats
https://hyperspy.org/rosettasciio
GNU General Public License v3.0

SFS Reader Origins and license. #36

Open imikejackson opened 2 years ago

imikejackson commented 2 years ago

Describe the bug

Could you provide background information on how the SFS Reader was developed? Is the SFS format a commercial product from AidAim Software? Did AidAim license the format to HyperSpy for GPLv3 use in Python?

Additional context

We would like to start using HyperSpy and add more features to the bruker.py file (EBSD data extraction specifically), but we want to be sure the licensing is in order. I looked back through the git commit history, and the bruker.py file (after following file renames) comes in as a single commit with no other comments.

jat255 commented 2 years ago

@imikejackson I'm not overly familiar with the development of that plugin, but I see the "original original" commit was here: https://github.com/hyperspy/hyperspy/commit/c198833d7b999a48835508b4c278762afc52cf42, which includes some comments:

#####################################################
#Reverse engineering of this file format should be considered as FAIR USE:
#  The main reason behind RE is interoperability of Bruker composite files (*.bcf,
#   *.pan saved with Esprit software) with open source packages/software allowing
#   much more advanced data manipulation and analysis.
#
#####################################################
### SFS File format ###
#SFS is proprietary single file system developed by AidAim Software(tm)
# natively used with Delphi(tm) draconically over-expensive and useless languages.
# This function tries to implement minimal reading capabilities based on
#  reverse engineering (RE), and can become useless with future format if mentioned
#  developers would introduce critical changes.
#
# This library is very basic and can read just un-encrypted type of sfs.
# At least two compression methods are used for compression zlib and bzip2.
# However sfs containers could use different compression methods which
#  is unknown for developer of this library, thus just zlib and bzip2
#  decompressions are implemented

@sem-geologist implemented this reader, and could probably describe more how it was developed, but it appears from the comments that it was reverse-engineered (although I'm not sure from what sources; perhaps just trial and error).

sem-geologist commented 2 years ago

It was reverse engineered from nothing. No documentation, no AidAim help, nothing. I worked out which technology it was from a few exposed facts: Esprit has scripting in Pascal/Delphi, and the bcf and pan file headers carry a file signature which, after some googling, I traced to AidAim (which creates solutions in Delphi). They advertise a list of the functionality their format implements, so I knew more or less what to look for in the format. Particle analysis *.pan files were particularly helpful, as they create massive file tables.

As you can see in the "original original" commit above (damn, git saves a bit too much history :)), there was quite a lot of furious frustration: SFS does not solve any modern problems and is such a waste of computing cycles. A simple Zip container would have worked more efficiently, but for some reason Bruker used the SFS technology. Some of the initially implemented features could not be tested and were later removed (i.e. bz2 compression).

So I licensed it as GPLv3 at the time, since HyperSpy had the same license. I should also mention that I am partway through reimplementing the reader directly for NeXLSpectrum.jl, which is going to be unlicensed.

ericpre commented 2 years ago

Transferred to hyperspy/rosettasciio as the IO development will continue here.

sem-geologist commented 2 years ago

@imikejackson, as your question got answered here, would you mind closing this issue? BTW, the RE of that format took me a few years, but it was worth it, as it opened the format and made it possible to offload processing of bcf files offline to servers with any OS. If you want any help with EBSD RE, I could share hints and mentor if needed. I looked into it initially (our lab has Bruker's EBSD): the raw data is packed very similarly to the binary data of the EDS cube, and the Esprit-interpreted data is very easy to get, as it is XML. However, I have no more time for doing anything with EBSD, since I work full-time with EPMA, and so I never implemented the reader.

vasole commented 2 years ago

Concerning the license of the code, I doubt this project can legally change it without the agreement of the original author (that means @sem-geologist).

sem-geologist commented 2 years ago

@vasole, what project are you talking about?

vasole commented 2 years ago

This one. My comment is related to #51. Unless you made an official copyright transfer, if you released the code under the GPL, only you can change the license.

sem-geologist commented 2 years ago

Why so pessimistic? It was already discussed before at HyperSpy, and now that the IO library is going to be split from HyperSpy, I think it is going to be even easier to change the license (at least I would agree to it being relicensed to LGPL or BSD). Is your project in particular relying on a Python solution?

vasole commented 2 years ago

It's not about being pessimistic. We changed the license of the fabIO project and we had to contact and get the agreement of all the contributors to do so. So, things can be cumbersome.

My project is the XRF analysis code PyMca (https://github.com/vasole/pymca), and what some of my users need is SFS support. That is the short term need.

For the long run, I guess not having to maintain software support of multiple file formats in my code and being able to rely on (and contribute to) a widely maintained library can only be of benefit to everybody.