hyperspy / hyperspy_swift_library

HyperSpy Nion Swift Library reader

New nionswift version broke library #3

Open ltizei opened 4 years ago

ltizei commented 4 years ago

The function _migrate_library() does not seem to exist in the nionswift version 0.15. Chris Meyer, the main developer of nionswift, has been made aware of this fact and said he will work towards a solution in the next release to keep hyperspy_swift_library working.

While waiting for a solution, the library only works with nionswift versions up to 0.14.8.

cmeyer commented 4 years ago

Here is some example code that could be used to get this project running again. It requires nionswift 0.15.0 or later. It will take some further integration into this project to make it work, but it should not be too difficult. I decided to provide it here instead of a PR so that someone who knows how to use/test the final product can integrate it.

NOTE: It will only load Swift 14 and 15 projects. It will probably work with projects from earlier versions.

CAUTION: This code is a hack; it is likely to break in future versions of Swift. The permanent solution will be available when nion-software/nionswift#539 is implemented.

from nion.swift.model import Persistence
from nion.swift.model import Profile
from nion.swift.model import Project
import pathlib
import typing

# hack code to read project. works with Swift 14 or 15 folder or index projects.
def read_project(file_path: typing.Union[str, pathlib.Path]) -> Project.Project:
    file_path = pathlib.Path(file_path)
    if file_path.suffix == ".nsproj":
        r = Profile.IndexProjectReference()
        r.project_path = file_path
    else:
        r = Profile.FolderProjectReference()
        r.project_folder_path = file_path
    r.persistent_object_context = Persistence.PersistentObjectContext()
    r.load_project(None)
    r.project._raw_properties["version"] = 3
    r.project.read_project()
    return r.project

def print_data_item_sizes(file_path: typing.Union[str, pathlib.Path]) -> None:
    p = read_project(file_path)
    for data_item in p.data_items:
        print(f"{data_item.title}: {data_item.xdata.data.shape}")

ltizei commented 4 years ago

@cmeyer There might be a smarter way that I am missing for this, but currently the dictionary of data_item.properties is not compatible with a Pandas dataframe.

md = reader.project.data_items[0].properties
df = pd.DataFrame(md)

with output:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-19-6b6bc25bae08> in <module>
      1 md = reader.project.data_items[0].properties
----> 2 df = pd.DataFrame(md)

~/miniconda3/envs/nion-test/lib/python3.8/site-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
    466 
    467         elif isinstance(data, dict):
--> 468             mgr = init_dict(data, index, columns, dtype=dtype)
    469         elif isinstance(data, ma.MaskedArray):
    470             import numpy.ma.mrecords as mrecords

~/miniconda3/envs/nion-test/lib/python3.8/site-packages/pandas/core/internals/construction.py in init_dict(data, index, columns, dtype)
    281             arr if not is_datetime64tz_dtype(arr) else arr.copy() for arr in arrays
    282         ]
--> 283     return arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
    284 
    285 

~/miniconda3/envs/nion-test/lib/python3.8/site-packages/pandas/core/internals/construction.py in arrays_to_mgr(arrays, arr_names, index, columns, dtype, verify_integrity)
     76         # figure out the index, if necessary
     77         if index is None:
---> 78             index = extract_index(arrays)
     79         else:
     80             index = ensure_index(index)

~/miniconda3/envs/nion-test/lib/python3.8/site-packages/pandas/core/internals/construction.py in extract_index(data)
    398 
    399             if have_dicts:
--> 400                 raise ValueError(
    401                     "Mixing dicts with non-Series may lead to ambiguous ordering."
    402                 )

ValueError: Mixing dicts with non-Series may lead to ambiguous ordering.

Is there a plan for the future structure of data_item.properties? Will it stay the same as in the current version? The current structure needs some extra steps to get into a pandas DataFrame, because the key 'metadata' contains a dictionary. Could the content of properties['metadata'] be the content of md['metadata']['hardware_source']['autostem'], or is there a reason for this formatting that should be taken into account?

ltizei commented 4 years ago

@cmeyer @francisco-dlp In fact, it was a mistake on my part. I managed to make something that works similarly to version 0.1 of the reader with the old Nion Swift libraries. There is one final point that is not clear to me. The new version cannot read the old libraries, as the reader itself has changed. Should a new version of hyperspy_swift_library be released, so that 0.1 takes care of the old formats and the new one handles nionswift 0.14 onwards?

ltizei commented 4 years ago

@cmeyer As I understood it, the reader should work for libraries from version 0.14, but I couldn't get it to work. I put an example in the branch nions15update of my fork. I can load the new project (Nion Swift Project 20201002.nsproj) but not the old library (TrialData/Nion Swift Library 20200606/). Am I doing something wrong?

francisco-dlp commented 4 years ago

@ltizei, it'll be easier to help you with this if you send a PR to this repository.

cmeyer commented 4 years ago

I don't know exactly the intended behavior of this plug-in, but as for the Swift side of things, the data_item.properties is an internal method and there is no guarantee about future compatibility or even whether it will continue to be available.

You probably want to use data_item.metadata instead, which is part of the long term API, although this package is still using the internal DataItem object rather than the public API DataItem, so some more work will need to be done after nion-software/nionswift#539 is implemented.

And from what I understand, you might need to explicitly construct a pandas Series or DataFrame from the dict. I'm only starting to use pandas - but my intention is to have support for it directly in Swift (see nion-software/nionswift/issues/433). So I'd like to understand more about how you're using it eventually.

pandas.DataFrame(pandas.Series({"abc": 3, "def": 4, "efg": {"hij": 6}}))
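
As a minimal sketch of the flattening step discussed above: pandas.json_normalize turns nested dictionaries into dotted column names, with one row per dictionary. The records below are invented stand-ins for the properties of two data items, not real Swift output. Only the nesting metadata -> hardware_source -> autostem comes from this thread; the leaf key high_tension_v and all values are hypothetical.

```python
import pandas as pd

# Hypothetical stand-ins for the properties of two data items.
# Only the nesting metadata -> hardware_source -> autostem is taken from
# the discussion above; the leaf key and every value are invented.
records = [
    {
        "title": "item_a",
        "metadata": {"hardware_source": {"autostem": {"high_tension_v": 60000}}},
    },
    {
        "title": "item_b",
        "metadata": {"hardware_source": {"autostem": {"high_tension_v": 100000}}},
    },
]

# json_normalize flattens the nested dicts into dotted column names
# and produces one row per record.
df = pd.json_normalize(records)
print(df.columns.tolist())
```

With a flat table like this, each nested value becomes an ordinary column that can be filtered or sorted, which may be enough until a public metadata API is available.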

ltizei commented 4 years ago

The behavior I am looking for is to easily parse the data_items and select some of them based on conditions. For example, if I am looking for all spectrum images (collection_dimension_count == 2) of spectra (datum_dimension_count == 1) whose titles contain 'spim' and 'CL', I would do this:

cond1 = df['datum_dimension_count']==1 
cond2 = df['collection_dimension_count']==2
cond3 = ['spim' in name for name in df['title']]
cond4 = ['CL' in name for name in df['title']]
df[cond1&cond2&cond3&cond4]['title']

And I would get:

44    Square1_NP1_CL_spim1_grat150_cent580nm_20ms
47    Square1_NP1_CL_spim2_grat150_cent580nm_20ms
50    Square1_NP1_CL_spim3_grat150_cent580nm_10ms
53    Square1_NP1_CL_spim4_grat150_cent580nm_10ms
56    Square1_NP1_CL_spim5_grat150_cent580nm_20ms
59    Square1_NP1_CL_spim6_grat150_cent580nm_20ms
62    Square1_NP1_CL_spim7_grat150_cent580nm_20ms
65    Square2_NP3_CL_spim1_grat150_cent580nm_20ms
68    Square2_NP3_CL_spim2_grat150_cent580nm_20ms
71    Square2_NP3_CL_spim3_grat150_cent580nm_20ms
74    Square2_NP3_CL_spim4_grat150_cent580nm_20ms
77    Square2_NP3_CL_spim5_grat150_cent580nm_20ms
80    Square2_NP3_CL_spim6_grat150_cent580nm_20ms
83    Square2_NP3_CL_spim7_grat150_cent580nm_20ms
86    Square2_NP3_CL_spim8_grat150_cent580nm_20ms

This makes automating tasks on parts of the whole dataset very easy.
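
The selection above can also be sketched with every condition expressed as a boolean Series, so that all masks are aligned on the DataFrame index before they are combined. This is an illustrative variant, not the code used in the reader: the three-row table is invented, and only the column names come from the example above.

```python
import pandas as pd

# Invented table standing in for the per-item DataFrame described above.
df = pd.DataFrame(
    {
        "title": [
            "Square1_NP1_CL_spim1",
            "Square1_NP1_EELS_spim1",
            "Square1_NP1_CL_image",
        ],
        "datum_dimension_count": [1, 1, 2],
        "collection_dimension_count": [2, 2, 0],
    }
)

# Each condition is a boolean Series; regex=False makes str.contains
# do a plain substring match.
mask = (
    (df["datum_dimension_count"] == 1)
    & (df["collection_dimension_count"] == 2)
    & df["title"].str.contains("spim", regex=False)
    & df["title"].str.contains("CL", regex=False)
)

# Select the titles of the matching rows in a single indexing step.
selected = df.loc[mask, "title"]
print(selected.tolist())
```

Using df.loc with a single combined mask avoids chained indexing, and the same mask can be reused to pick the matching data items for further processing.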

TomaSusi commented 2 years ago

Just came across this project and wanted to use it but...