Oxford uncompressed .ebsp file load problem in Aztec V6.1

Tijmenvermeij commented 1 month ago

Hi,

We recently upgraded Aztec to V6.1, and it seems that now we cannot load uncompressed .ebsp files anymore using Kikuchipy. It gives the error that the .ebsp is compressed, but this is not the case (according to what we specify in Aztec software). Does anyone have any idea what the problem can be?

A workaround for us would be to export the patterns to .h5oina and load those, but that would double the amount of data that we use in the end...

Thanks, Tijmen

hakonanes commented 1 month ago

Hi @Tijmenvermeij,

Thanks for this report. Just to clarify, you can read uncompressed patterns written with AZtec v6.0, but not with 6.1?

The reason the reader claims that the patterns are compressed lies here in OxfordBinaryReader.get_single_pattern_header(offset):

https://github.com/pyxem/kikuchipy/blob/d5db481eb1bc13ecb3d5bd8691312fa57c8a7467/src/kikuchipy/io/plugins/oxford_binary.py#L438-L445

If the particular byte assumed to contain the boolean is anything other than 0, it evaluates to true.
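A minimal sketch of that failure mode, not the actual kikuchipy code: the flag is assumed here to be stored as a 4-byte little-endian integer, and the helper name flag_is_set is made up for illustration. It only shows that any non-zero value read from the flag position is reported as "compressed", so if the header layout shifts, an unrelated value can end up being read as the flag.

```python
import struct


def flag_is_set(raw: bytes) -> bool:
    """Interpret 4 bytes as a little-endian int32 and cast it to a boolean.

    The field width and byte order are assumptions made for this sketch.
    """
    (value,) = struct.unpack("<i", raw)
    return bool(value)


# A genuine "not compressed" flag.
print(flag_is_set(struct.pack("<i", 0)))
# Any other value, e.g. an unrelated field that now sits at the old flag
# position, is also interpreted as "compressed".
print(flag_is_set(struct.pack("<i", 57)))
```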

Can you send me a small .ebsp file of uncompressed patterns written with this software version? The only way we can fix this is by reverse engineering.

Unfortunately, Oxford Instruments doesn't publish specifications for the binary .ebsp files (they are meant for internal use only, I've been told). The specification of their public HDF5 format (H5OINA) provides the following information on what's new in H5OINA v6.0 (corresponding to AZtec v6.1):

Add support for Unity, including export of multidetector systems with Unity and an auxillary detector.

What they have changed in the binary format to accommodate this, I have no clue.

A workaround for us would be to export the patterns to .h5oina and load those, but that would double the amount of data that we use in the end...

This is a valid problem and something that could be brought up with the Oxford Instruments folks!

Tijmenvermeij commented 1 month ago

Hi,

We upgraded Aztec from V5.1 to V6.1(SP3). V5.1 patterns worked fine with Kikuchipy. See here a small .ebsp file, saved as uncompressed with V6.1: https://www.dropbox.com/scl/fi/4nm24gdddb92u3ve303uw/32014751-d469-4c74-88fc-af3797e6872a.ebsp?rlkey=j1iob6d6ng6wawfxgmg5zz9db&dl=1

Note that the patterns are pure noise.

If necessary, we can contact Oxford Instruments and ask for clarification.

Thanks! Tijmen

hakonanes commented 1 month ago

Can you give any metadata about the file (number of patterns, pattern rows and columns, data type uint8 or uint16)? The reader interprets there to be 401 patterns of shape (nrows, ncols) = (128, 156), but this just leaves 8 007 209 - (401 × 128 × 156) = 41 bytes for remaining information, which I'm 99% sure is too little. The .ebsp files I've seen have an 8-byte file version in the beginning, then each pattern's byte starting position, then each pattern pre-pended with a 16-byte header and sometimes appended with an 18-byte footer. I think the number of patterns, 401, is incorrect.
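The byte budget above can be checked with a few lines of Python. This is only a sketch: the function name ebsp_size_budget is made up, and the 8-byte width assumed for each pattern's starting position is a guess, since only the 8-byte version, the 16-byte header and the optional 18-byte footer are described above.

```python
def ebsp_size_budget(
    file_size: int,
    n_patterns: int,
    n_rows: int,
    n_cols: int,
    bytes_per_pixel: int = 1,  # uint8 patterns
    version_bytes: int = 8,
    offset_bytes: int = 8,  # assumed width of each pattern's start position
    header_bytes: int = 16,
    footer_bytes: int = 0,  # 18 when a footer is present
) -> tuple[int, int]:
    """Return (bytes left after pixel data, bytes the metadata should need)."""
    pixel_bytes = n_patterns * n_rows * n_cols * bytes_per_pixel
    available = file_size - pixel_bytes
    expected = version_bytes + n_patterns * (
        offset_bytes + header_bytes + footer_bytes
    )
    return available, expected


# 401 patterns of (128, 156) leave far less room than the metadata needs.
available, expected = ebsp_size_budget(8_007_209, 401, 128, 156)
print(available, expected)

# With 400 patterns there is room for per-pattern metadata.
print(ebsp_size_budget(8_007_209, 400, 128, 156))
```

If the available bytes are much smaller than the expected metadata size, the assumed number of patterns (or the pattern shape) cannot be right.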

Tijmenvermeij commented 1 month ago

Hi,

See here the .h5oina file with the patterns included: https://www.dropbox.com/scl/fi/elp7deoyrz0r9l2clh6g3/Project-2-Specimen-1-Site-1-Map-Data-2.h5oina?rlkey=v1gpnhvjgjldtp8k1lhixzs2v&dl=1

There should be 400 patterns and they should be 8bit... The pattern size you mention seems to be correct.

Yimin-Zhu commented 1 month ago

Hi both, regarding the new AZtec .ebsp file, I got some clues from the EMsoft developer (Marc DeGraef): "starting from AZtec6 the number of bytes in each pattern header change from 16 to 42, and there is a 25 offsets to the pattern". Hope this helps.

Tijmenvermeij commented 1 month ago

Thanks @Yimin-Zhu

I tried to load patterns from .h5oina, which works, but only when patterns are saved "processed" (8bit, incl BG correction). When saving as "unprocessed", the .h5oina doesn't seem to load properly in Kikuchipy. Not sure what the problem is; the memory starts filling up, indicating that "Lazy" import does not work.

So I will start to have a look at loading of the .ebsp files myself. @hakonanes please let me know if you already made any progress... I'll do the same.

Thanks! Tijmen

CiosG commented 1 month ago

I'm not sure if it helps anyhow...

When "unprocessed" patterns are stored, Aztec stores both "processed" and "unprocessed" patterns in the project folder: there should be an .ebsp and a .uebsp file with the same name but different extensions for the same dataset. Aztec uses only the 8-bit processed patterns for indexing.

Once you export to .h5oina, both "processed" and "unprocessed" patterns are exported to it (probably starting from Aztec 6.1). If you don't want to export the unprocessed patterns, I guess you can temporarily rename the .uebsp file by adding one character.

"Unprocessed Static Background" can be found in Data/Header while unprocessed patterns are stored next to processed ones in .h5oina (and are readable by CrossCourt at least v.4.6.6.0)

When Pat Trimby was still the EBSD product manager at OINA, I told him that I wanted to export my data to .h5oina, delete the .ebsp files, and be able to read the .h5oina back into Aztec to do, for example, Hough re-indexing. Or the other way around: to do Hough re-indexing in AztecCrystal from .h5oina, just to reduce the amount of data stored, accepting, of course, that it may be slower than in Aztec from .ebsp. He is not at OINA anymore, and I don't think they have wanted to do this since MapSweeper was released.

Kind regards, Grzegorz

-- Cios Grzegorz Academic Centre for Materials and Nanotechnology (ACMiN) AGH University of Science and Technology, Krakow, Poland 30 Mickiewicza, 30-059 Krakow bldg. D-16 (Kawiory 30, 30-055 Krakow), room 2.23 tel: +48 12 617 52 78

Tijmenvermeij commented 1 month ago

Thanks for the information, Grzegorz!

Yeah I'm quite annoyed by Oxford's data management. Basically I'm storing 3 times as much data as actually needed, which adds up quite fast with some of our larger scans... Pat Trimby leaving doesn't help I guess, but perhaps we need to start bothering Mark Coleman or someone else about this :)

I'm not sure why Kikuchipy has issues importing the .h5oina files with unprocessed patterns included. The name of the processed dataset in the h5 structure seems to be the same... I'll run some more trials on a small dataset.

Tijmenvermeij commented 1 month ago

So about importing .h5oina data in Kikuchipy, it seems that the dataset name 'Processed Patterns' is still valid. But once there are also Unprocessed Patterns stored in the h5, it seems to me that the Kikuchipy reader starts to load these into memory, even when Lazy=True. Does the reader by default import all other data into memory, except for the 'Processed Patterns', when Lazy=True?

Tijmenvermeij commented 1 month ago

So about importing .h5oina data in Kikuchipy, it seems that the dataset name 'Processed Patterns' is still valid. But once there are also Unprocessed Patterns stored in the h5, it seems to me that the Kikuchipy reader starts to load these into memory, even when Lazy=True. Does the reader by default import all other data into memory, except for the 'Processed Patterns', when Lazy=True?

Sorry for the spam, but I solved this issue, at least temporarily. I found the line that specifies that the pattern dataset should not be read into memory (I think), and added "Unprocessed Patterns" to it. I changed line 99 in oxford_h5ebsd.py to the following:

dd = _hdf5group2dict(group["EBSD/Data"], data_dset_names=[self.patterns_name, "Unprocessed Patterns"])

This solves the issue for me. Perhaps this is not a suitable permanent solution though. I guess the user would need to have the option to import unprocessed patterns, if they like, instead of processed patterns...
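To illustrate why adding a name to that list could stop the memory from filling up, here is a rough sketch. It is not kikuchipy's implementation: FakeDataset and group_to_dict are hypothetical stand-ins, and the behaviour (read every dataset into memory except the named ones) is only my assumption about what such a helper does.

```python
class FakeDataset:
    """Stand-in for an on-disk dataset that counts how often it is read."""

    def __init__(self, values):
        self._values = values
        self.reads = 0

    def read(self):
        self.reads += 1
        return list(self._values)


def group_to_dict(group, lazy_names=()):
    """Convert a nested group to a dict, reading all datasets into memory
    except those whose name is in `lazy_names` (kept as unread handles)."""
    out = {}
    for name, item in group.items():
        if isinstance(item, dict):  # nested group
            out[name] = group_to_dict(item, lazy_names)
        elif name in lazy_names:  # keep the handle, do not read
            out[name] = item
        else:  # everything else is loaded eagerly
            out[name] = item.read()
    return out


processed = FakeDataset([1, 2, 3])
unprocessed = FakeDataset([4, 5, 6])
group = {"Processed Patterns": processed, "Unprocessed Patterns": unprocessed}

# Only one name listed: the unprocessed patterns are read into memory.
group_to_dict(group, lazy_names=["Processed Patterns"])
print(processed.reads, unprocessed.reads)

# Both names listed: neither pattern dataset is read again.
group_to_dict(group, lazy_names=["Processed Patterns", "Unprocessed Patterns"])
print(processed.reads, unprocessed.reads)
```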

hakonanes commented 1 month ago

I got some clues from the EMsoft developer(Marc DeGraef): "starting from AZtec6 the number of bytes in each pattern header change from 16 to 42, and there is a 25 offsets to the pattern"

Thank you for sharing, @Yimin-Zhu, this may be exactly the information we need.

@Tijmenvermeij, I opened #692 to track your issue with lazy loading of H5OINA files. Thank you for reporting this. I suggest continuing that discussion there.

@CiosG, thank you for bringing your knowledge of AZtec's workings to this issue. I've opened #693 to address the *.uebsp extension (unknown to me!).

hakonanes commented 1 week ago

@Tijmenvermeij, can you confirm that this is what the first pattern in your small test dataset (.ebsp and .h5oina) should look like?

[Image: first_pattern]

hakonanes commented 1 week ago

The new pattern header in Oxford Instruments' *.ebsp files with version 6 (possibly also 5) is:

int32 (map x)
int32 (map y)
int32 (is_compressed)
int32 (n pattern rows)
int32 (n pattern columns)
int32 (n pattern bytes)

The footer stays the same, with beam (x, y), which stores the same information as map (x, y) scaled by the step size.
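As a sketch of how the six-field header listed above could be parsed, assuming little-endian byte order (not stated in this thread) and using made-up names (HEADER_FORMAT, PatternHeader, parse_pattern_header) rather than kikuchipy's own:

```python
import struct
from typing import NamedTuple

# Six int32 fields in the order listed above; "<" (little-endian) is assumed.
HEADER_FORMAT = "<6i"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 6 * 4 bytes


class PatternHeader(NamedTuple):
    map_x: int
    map_y: int
    is_compressed: bool
    n_rows: int
    n_cols: int
    n_bytes: int


def parse_pattern_header(buffer: bytes, offset: int = 0) -> PatternHeader:
    """Unpack one pattern header starting at `offset` in `buffer`."""
    map_x, map_y, compressed, n_rows, n_cols, n_bytes = struct.unpack_from(
        HEADER_FORMAT, buffer, offset
    )
    return PatternHeader(map_x, map_y, bool(compressed), n_rows, n_cols, n_bytes)


# Synthetic header for an uncompressed (128, 156) uint8 pattern at map (3, 0).
raw = struct.pack(HEADER_FORMAT, 3, 0, 0, 128, 156, 128 * 156)
header = parse_pattern_header(raw)
print(HEADER_SIZE, header)
```

With this layout the compression flag is the third int32, so map x and map y have to be skipped (or read) before the flag is evaluated.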

[Image: metadata]

@marcdegraef, thanks for pointing the extra bytes out to @Yimin-Zhu, who pointed it out to us here.

hakonanes commented 1 week ago

@Tijmenvermeij, I made a fix in https://github.com/hakonanes/kikuchipy/tree/690-oxford-ebsp-aztec-6.1, could you try it out?

python -m pip install 'kikuchipy@git+https://github.com/hakonanes/kikuchipy@690-oxford-ebsp-aztec-6.1'

Tijmenvermeij commented 1 week ago

@Tijmenvermeij, can you confirm that this is what the first pattern in your small test dataset (.ebsp and .h5oina) should look like?

Hi, this seems correct!

hakonanes commented 1 week ago

Thanks for confirming! Then I'll go ahead with the patch release.

hakonanes commented 1 week ago

Fixed in #700, will be part of a patch release 0.11.1 soon.

pyxem / kikuchipy

Oxford uncompressed .ebsp file load problem in Aztec V6.1 #690
