mjredmond commented 6 years ago

It looks like the op2 reader reads the full file into memory. I was planning on using the export_to_hdf5 method to convert to a temporary h5 format first, then convert to the h5Nastran format. I often work with op2 files that are 30+ gigs, so I'm running into memory limitations.

Would it be difficult to convert to hdf5 as the op2 is read instead of reading it all into memory first?

Thanks.

SteveDoyle2 commented 6 years ago

Would it be difficult to convert to hdf5 as the op2 is read instead of reading it all into memory first?

That's definitely on the table of things to do. I have MSC Nastran 2005, so I don't have access to creating MSC's HDF5 files (outside of h5Nastran). As of this week, I now have access to NX 12, which also supports HDF5, but I haven't run through their latest set of example problems. My thought had been to wait on re-architecting the code to directly put the data into HDF5 format until you were happy with the structure (e.g., had done various tests) and then just copy it. That's probably close enough now.

The biggest question of HDF5 that I have is, can I just load say a 4 GB file into memory and use the HDF5 structure as an API? I'd like to not write out a file if I don't need to.

I figure we should also standardize on a single HDF5 package. H5Nastran uses pytables, while the OP2-HDF5 exporter uses h5py. I'm not really familiar with the differences in using it, so I don't have a huge preference.

Regarding memory usage though, it should be easy to reduce that (depending on your data). You can limit results by subcase and result quantity The default is to read everything.

model = read_op2(op2_filename)  # reads everything
model = read_op2(op2_filename, subcases=[1])  # reads only subcase 1
model = read_op2(op2_filename , subcases=[1], exclude_results=['stress'])

reads only subcase 1 stresses

model = read_op2(op2_filename , subcases=[1],

exclude_results=['rod_stress']) # reads only subcase 1 CROD stresses model = read_op2(op2_filename, include_results=['displacement'])

The result names are checked, so you can't screw up your include/exclude list. Some of the exclude statements get ignored (and are just included), but for the most part are respected. Just let me know if you run into one.

On Tue, Jan 30, 2018 at 11:55 AM, mjredmond notifications@github.com wrote:

It looks like the op2 reader reads the full file into memory. I was planning on using the export_to_hdf5 method to convert to a temporary h5 format first, then convert to the h5Nastran format. I often work with op2 files that are 30+ gigs, so I'm running into memory limitations.

Would it be difficult to convert to hdf5 as the op2 is read instead of reading it all into memory first?

Thanks.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/SteveDoyle2/pyNastran/issues/474, or mute the thread https://github.com/notifications/unsubscribe-auth/ABAqWTpQU5yXpqBWKtl-iIe8WFvtMJ0Jks5tP3OjgaJpZM4Ry6sv .

mjredmond commented 6 years ago

My thought had been to wait on re-architecting the code to directly put the data into HDF5 format until you were happy with the structure (e.g., had done various tests) and then just copy it. That's probably close enough now.

So you would directly create a h5Nastran file from the op2? If that's the case, the op2 reader could be modeled similar to the h5Nastran punch file reader... read a table/matrix/whatever gather enough meta data to describe the data and then "toss it over the fence"... then whatever catches that data can handle it appropriately.

As of this week, I now have access to NX 12, which also supports HDF5, but I haven't run through their latest set of example problems.

I didn't know that. Is it similar to MSC's format? Is there documentation somewhere?

The biggest question of HDF5 that I have is, can I just load say a 4 GB file into memory and use the HDF5 structure as an API? I'd like to not write out a file if I don't need to.

Can you elaborate? I don't quite follow.

I figure we should also standardize on a single HDF5 package. H5Nastran uses pytables, while the OP2-HDF5 exporter uses h5py. I'm not really familiar with the differences in using it, so I don't have a huge preference.

I've only used pytables. In the future (not sure when), pytables will be built on top of h5py, in which case it wouldn't matter which library is used. For now, I'd prefer pytables since it seems to work pretty well for what I'm doing. However, I think pytables can ready h5py created hdf5 files with no or little issues as it currently stands.

Thanks.

SteveDoyle2 commented 6 years ago

I didn't know that. Is it similar to MSC's format? Is there documentation somewhere?

I'm not sure. I read some of the release notes on NX 12 about 10 minutes ago :).

https://docs.plm.automation.siemens.com/data_services/resources/nxnastran/12/help/tdoc/en_US/pdf/release_guide.pdf

It's probably in the DMAP guide is my guess.

pytables vs h5py

They're two separate packages, so I'm just trying to make the code have fewer dependencies. It doesn't really make sense to me to to have two separate packages to work with HDF5 data. I spent some time picking h5py (they claimed they worked better with numpy objects), but it's only ~100 lines of code for the entire op2 hdf5 exporter, so standardizing on h5py probably doesn't make the most sense.

I was also unaware of the merger, so that might change things.

On Tue, Jan 30, 2018 at 12:52 PM, mjredmond notifications@github.com wrote:

As of this week, I now have access to NX 12, which also supports HDF5, but I haven't run through their latest set of example problems.

I didn't know that. Is it similar to MSC's format? Is there documentation somewhere?

The biggest question of HDF5 that I have is, can I just load say a 4 GB file into memory and use the HDF5 structure as an API? I'd like to not write out a file if I don't need to.

Can you elaborate? I don't quite follow.

I figure we should also standardize on a single HDF5 package. H5Nastran uses pytables, while the OP2-HDF5 exporter uses h5py. I'm not really familiar with the differences in using it, so I don't have a huge preference.

I've only used pytables. In the future (not sure when), pytables will be built on top of h5py, in which case it wouldn't matter which library is used. For now, I'd prefer pytables since it seems to work pretty well for what I'm doing.

Thanks.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/SteveDoyle2/pyNastran/issues/474#issuecomment-361730349, or mute the thread https://github.com/notifications/unsubscribe-auth/ABAqWcTDYq6bfGbSoAPwv18yOguYzV6rks5tP4ETgaJpZM4Ry6sv .

mjredmond commented 6 years ago

The biggest question of HDF5 that I have is, can I just load say a 4 GB file into memory and use the HDF5 structure as an API? I'd like to not write out a file if I don't need to.

Oh, you want to convert an op2 to h5Nastran, but not write out the h5Nastran file, just keep it in memory. It's currently not implemented but is possible, since you can have in memory h5 files with pytables. I think it would be fairly easy to do.

As far as the pytables/h5py merger, I haven't seen any progress in over a year, so I have no clue when that is happening. Unfortunately I don't think h5py can make tables like pytables can (if it does I have no clue and haven't see any examples).

Just made commit, now you can make in memory hdf5 files by passing in_memory=True to H5Nastran.

SteveDoyle2 commented 6 years ago

I saw this post from May of 2017, but nothing since. I

https://www.numfocus.org/numfocus-awards-small-development-grants-to- projects/

you want to convert an op2 to h5Nastran, but not write out the h5Nastran file, just keep it in memory.

Yes. Maybe a better way to say that. I see a few use cases:

use numpy, which I think would be faster/fewer requirements/nice for backwards compatibility
use hdf5 in memory to speed up testing and it's good for test models and frequently changing models (e.g., optimization)
use hdf5 from a file for very large models

On Wed, Jan 31, 2018 at 5:50 AM, mjredmond notifications@github.com wrote:

The biggest question of HDF5 that I have is, can I just load say a 4 GB file into memory and use the HDF5 structure as an API? I'd like to not write out a file if I don't need to.

Oh, you want to convert an op2 to h5Nastran, but not write out the h5Nastran file, just keep it in memory. It's currently not implemented but is possible, since you can have in memory h5 files with pytables. I think it would be fairly easy to do.

As far as the pytables/h5py merger, I haven't seen any progress in over a year, so I have no clue when that is happening. Unfortunately I don't think h5py can make tables like pytables can (if it does I have no clue and haven't see any examples).

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/SteveDoyle2/pyNastran/issues/474#issuecomment-361937654, or mute the thread https://github.com/notifications/unsubscribe-auth/ABAqWXDLgvbxNsuOO0jMsu1dnNGM14osks5tQG-zgaJpZM4Ry6sv .

dmap-guru commented 6 years ago

It looks like you guys don't have any documentation for HDF5 files output by MSC Nastran. In versions 2017 and 2018, the installation publishes the HDF5 schema at /msc20170/util/nh5rdb (replace 20170 with 20180 for V2018) in xml, html and C header file (.h) formats. Here you can see all the relations that the version supports. If you download the HDF5 viewer at hdfgroup.org (java) or vitables from vitables.org (python), you can read the .h5 file created by Nastran directly to see how the schema stores the data. The Release Guides 2016, 2017 and 2018 explain various aspects of the HDF5 file and the data Nastran writes to it. If you don't have access to these data, let me know and I can get them to you.

mjredmond commented 6 years ago

I have the MSC 2018 hdf5 xml files. I'd really like to have some 2018 h5 files created by Nastran... the example files they have are far too narrow in scope. I have a lot of TODO's because the hdf5 format stores a number instead of a string as defined in the QRG in many cases. If you have any example files that would be greatly appreciated. Thanks.

SteveDoyle2 commented 6 years ago

In terms of examples, the models/elements folder has tiny examples with static, modal, frequency, transient and buckling to test out a varirty of elements.

I assume those numbers are also consistent with the op2. If you have the bdf that made it, it's pretty easy to reverse engineer.

On Mon, Mar 12, 2018, 7:26 AM mjredmond notifications@github.com wrote:

I have the MSC 2018 hdf5 xml files. I'd really like to have some 2018 h5 files created by Nastran... the example files they have are far too narrow in scope. I have a lot of TODO's because the hdf5 format stores a number instead of a string as defined in the QRG in many cases. If you have any example files that would be greatly appreciated. Thanks.

— You are receiving this because you commented.

Reply to this email directly, view it on GitHub https://github.com/SteveDoyle2/pyNastran/issues/474#issuecomment-372327977, or mute the thread https://github.com/notifications/unsubscribe-auth/ABAqWUEp887ekpWDa967q0lsmmx8Kb9Hks5tdoWqgaJpZM4Ry6sv .

SteveDoyle2 commented 6 years ago

I assume those numbers are also consistent with the op2.

To be clear, if you go to the card that you're interested in and look at how the OP2 reader handles it, you can just use that.

For example, in the CHBDYG class (pyNastran/bdf/thermal/thermal.py), you'll see add_op2_data:

@classmethod
def add_op2_data(cls, data, comment=''):
    """
    Adds a CHBDYG card from the OP2

    Parameters
    ----------
    data : List[varies]
        a list of fields defined in OP2 format
    comment : str; default=''
        a comment for the card
    """
    eid = data[0]
    surface_type = data[1]
    i_view_front = data[2]
    i_view_back = data[3]
    rad_mid_front = data[4]
    rad_mid_back = data[5]
    nodes = [datai for datai in data[6:14] if datai > 0]
    if surface_type == 3:
        surface_type = 'REV'
    elif surface_type == 4:
        surface_type = 'AREA3'
    elif surface_type == 5:
        surface_type = 'AREA4'
    #elif surface_type == 7: # ???
        #surface_type = 'AREA6'
    elif surface_type == 8:
        surface_type = 'AREA6'
    elif surface_type == 9:
        surface_type = 'AREA8'
    else:
        raise NotImplementedError('eid=%s surface_type=%r' % (eid, surface_type))

    assert surface_type in ['REV', 'AREA3', 'AREA4', 'AREA6', 'AREA8'], 'surface_type=%r data=%s' % (surface_type, data)
    return CHBDYG(eid, surface_type, nodes,
                  iview_front=i_view_front, ivew_back=i_view_back,
                  rad_mid_front=rad_mid_front, rad_mid_back=rad_mid_back,
                  comment=comment)

Some cards implement that sort of thing directly in the OP2 geometry reader (e.g., in *pyNastran/op2/geom/.py**, but it's going to be in one of the two places.

mjredmond commented 6 years ago

I didn't think of that. Thanks! Although I just noticed that at least for some of the cards I'm having trouble with don't have that method, for example PBUSH1D.

SteveDoyle2 commented 6 years ago

The dmap.pdf might also have some useful data, but usually not. The integers usually start at 1 and are in the same order as listed in the QRG, but not always.

If you really need the PBUSH1D, then I'd just make a small test case. Otherwise, it's a fairly uncommon card.

SteveDoyle2 commented 6 years ago

@mjredmond

I spent some time over the weekend. I've now got an OP2 option for loading the op2 data directly to an hdf5 file (currently automatically set to the op2_filename, but with an h5 extension).

model = OP2()
model.load_as_h5 = True
model.read_op2(op2_filename)

It's still in the pyNastran format, which may not be the best. It also only works on tables (e.g., displacement, velocity, spc forces, etc.) and real stress at the moment. It's fully compatible with the f06 writer, and exporting to the pyNastran-style h5 format.

Other than the data format, what I haven't figured out is how to close the file object once the OP2 model object goes out of scope.

mjredmond commented 6 years ago

That's a good step. The file object you're referring to is the h5 file object? You could add a close method to the OP2 object to close the underlying h5 file. It would just have to be documented that it is required to manually close when using the h5 format. The only other way I know of is using del but that could cause issues with garbage collection.

SteveDoyle2 commented 6 years ago

Yeah, I'm referring to the h5 file object. The dump file needs to stay open in order to access the data (e.g., write the f06), but definitely needs to be closed before say using the h5 file in another test. I would have thought the OP2 object should automatically go out of scope at the end of the test and thus the h5 file gets closed, but nope.

You could add a close method to the OP2 object to close the underlying h5 file.

That would work. I'm trying to avoid having to remember to close it when I'm using an h5 file, but not when I don't. Maybe it's finally time to add a context manager?

__del__ might work, but that's not something I use often. I'll give it a shot.

SteveDoyle2 commented 5 years ago

This isn't done for all result types, but I'm going to call it done for basic results (e.g., displacement/spc/mpc forces, stress). If there are additional results that are desired, it's pretty easy to add by doing the following in the build method:

    _times = zeros(self.ntimes, dtype=dtype)
    element = zeros(self.ntotal, dtype='int32')

    #[sd, sxc, sxd, sxe, sxf, axial, smax, smin, MS]
    data = zeros((self.ntimes, self.ntotal, 9), dtype='float32')

    if self.load_as_h5:
        group = self._get_result_group()
        self._times = group.create_dataset('_times', data=_times)
        self.element = group.create_dataset('element', data=element)
        self.data = group.create_dataset('data', data=data)
    else:
        self._times = _times
        self.element = element
        self.data = data

SteveDoyle2 / pyNastran

OP2: can't read large files, problem for hdf5 export #474

reads only subcase 1 stresses