libstorage / libstoragemgmt

A library for storage management
https://libstorage.github.io/libstoragemgmt-doc/
GNU Lesser General Public License v2.1

Feature request: Add enclosure support #211

Open BlaineEXE opened 8 years ago

BlaineEXE commented 8 years ago

Some feedback we've gotten from one of our test labs is that we are not able to monitor the status of JBODs with lsm. If we use an lsm-only approach to management, the only way to detect JBOD health issues is to view the health LED on the physical chassis.

I've already completed about 35% of the work needed to enable this feature, but I haven't been able to work on it in a while. It would be good to develop some requirements around what details need to be included in an enclosure moving forward.

Desired properties:

cathay4t commented 8 years ago

@BlaineEXE I hope you mean adding more library-level local APIs instead of creating a plugin named SES or expanding the hpsa plugin. From my point of view, the features you are requesting could be done via the SES standard, or at least via SCSI T10 standards.

May I have an early look at your 35% work? I once created an SES plugin but dropped it.

BlaineEXE commented 8 years ago

The work I had done focused on using the HPSA plugin to get the information. Joe and I have discussed the SES method you mentioned, and it has merit, but I think a combination of both vendor-specific plugin and SES methods will be best. The vendor-specific plugin would be needed if one were to put a JBOD behind a RAID controller, for example. And the SES method would be preferable for a customer who knows that all controllers are going to be in HBA mode. I think this can be broken into two parts that can be developed simultaneously.

Part 1) Create the Enclosure data type/structure. The HPSA plugin adds data to Enclosure structures. Within the plugin, SES commands can be called when appropriate (i.e., for enclosures behind HBAs). The plugin can also give information about enclosures behind RAID mode controllers. One of the properties HPSSACLI returns about an enclosure is a SEP WWID, which we can use to determine a drive bay location with more precision.

Part 2) The SES functionality you have mentioned is still appropriate but does not add data into an Enclosure structure. For other plugins that do not return an enclosure's SEP WWID, SES functionality could deliver that information in the plugin.
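
A minimal sketch of one way the two parts could fit together, assuming a plugin builds the Enclosure record and falls back to an SES query only when the vendor tool did not report a SEP WWID. The `Enclosure` fields, `build_enclosure`, and the `ses_lookup` callback are all hypothetical names for illustration; none of them exist in libstoragemgmt.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Enclosure:
    """Hypothetical enclosure record; field names are assumptions."""
    name: str
    serial_number: str
    sep_wwid: Optional[str] = None


def build_enclosure(
    name: str,
    serial_number: str,
    vendor_sep_wwid: Optional[str],
    ses_lookup: Callable[[str], Optional[str]],
) -> Enclosure:
    """Build an Enclosure inside a plugin.

    The vendor-reported SEP WWID wins when present. Otherwise the
    plugin asks the (hypothetical) SES helper, which only returns a
    value and never touches the Enclosure itself.
    """
    if vendor_sep_wwid is not None:
        sep_wwid = vendor_sep_wwid
    else:
        sep_wwid = ses_lookup(serial_number)
    return Enclosure(name=name, serial_number=serial_number, sep_wwid=sep_wwid)


if __name__ == "__main__":
    # Canned SES answer standing in for a real SES query.
    fake_ses = lambda serial: "0x5000000000000001"
    print(build_enclosure("jbod-1", "SN123", None, fake_ses))
```

Keeping the SES helper as a plain lookup means a plugin that already gets the WWID from its vendor tool never has to issue SES commands.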

There is still a lot of discussion and planning involved to make sure this is successful. It would be good to sketch out how everything should look and work.

My 35% work: https://github.com/BlaineEXE/libstoragemgmt/commits/add-enclosure-support

cathay4t commented 8 years ago

Thank you. Just as I expected:

I will take a look at your code, and then we can work out a plan. But it has to wait a while (I am taking PTO next week, Oct 1-7).

cathay4t commented 8 years ago

@BlaineEXE I took a quick look at your code. It seems you intend to create a plugin API like lsm.Client.enclosures(). All the properties you are asking for are in existing standards (SES and SPL), so the feature you are requesting should be library-level APIs like lsm.LocalDisk.enclosure_name_get('/dev/sda') or lsm.LocalDisk.enclosure_sn_get('/dev/sda'), instead of a plugin-level API.

Playing with SES and SPL in C is fun for me, but the above properties (especially enclosure status and port link speed) require a lot of work. Is there a good use case I could use to persuade my boss?

cathay4t commented 8 years ago

@BlaineEXE To simplify things, are you seeking implementations of these APIs (taking the Python API as an example):

 * lsm.LocalDisk.enclosure_name_get("/dev/sda")
 * lsm.LocalDisk.enclosure_sn_get("/dev/sda")
 * lsm.LocalDisk.link_speed_get("/dev/sda")
    # Return ["6G", "6G"] if dual port, else ["6G"]
 * lsm.LocalDisk.sensors_get("/dev/sda")
    # Return [("<sensor_name_location>", "80C", STATUS_TEMP_WARN), ... ]
 * lsm.LocalDisk.fans_get("/dev/sda")
    # Return [("<fan_name_location>",
                <cur_speed>, <max_speed>, STATUS_OK,), ... ]
 * lsm.LocalDisk.power_supplies_get("/dev/sda")
    # Return [("<power_supply_name_location>", 
                STATUS_OK/STATUS_ERROR), ...]
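
A rough sketch of how a caller might consume the return shapes listed above, assuming each getter returns a list of tuples whose first item is a name and whose last item is a status. `FakeLocalDisk` and the `STATUS_*` values are hypothetical stand-ins returning canned data; the proposed methods are not part of the existing lsm.LocalDisk API.

```python
# Hypothetical status constants; real values would be defined by the library.
STATUS_OK = "OK"
STATUS_TEMP_WARN = "TEMP_WARN"
STATUS_ERROR = "ERROR"


class FakeLocalDisk:
    """Stand-in returning canned data in the shapes proposed above."""

    @staticmethod
    def link_speed_get(disk_path):
        # One entry per port: two entries for a dual-port disk.
        return ["6G", "6G"]

    @staticmethod
    def sensors_get(disk_path):
        return [
            ("sensor_front", "80C", STATUS_TEMP_WARN),
            ("sensor_rear", "35C", STATUS_OK),
        ]

    @staticmethod
    def fans_get(disk_path):
        # (name, current speed, maximum speed, status)
        return [("fan_1", 4200, 9000, STATUS_OK)]

    @staticmethod
    def power_supplies_get(disk_path):
        return [("psu_a", STATUS_OK), ("psu_b", STATUS_ERROR)]


def unhealthy_components(disk_path):
    """Return (name, status) for every component not reporting STATUS_OK."""
    problems = []
    getters = (
        FakeLocalDisk.sensors_get,
        FakeLocalDisk.fans_get,
        FakeLocalDisk.power_supplies_get,
    )
    for getter in getters:
        for entry in getter(disk_path):
            name, status = entry[0], entry[-1]
            if status != STATUS_OK:
                problems.append((name, status))
    return problems


if __name__ == "__main__":
    for name, status in unhealthy_components("/dev/sda"):
        print(f"{name}: {status}")
```

With shapes like these, a monitoring script could flag a failing JBOD component without anyone reading the health LED on the chassis.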

joehandzik commented 8 years ago

@cathay4t Thanks for taking a look at Blaine's code. I agree that we should be able to get most of this information via standard methods as well as via hpssacli. The main reason we erred on the side of starting with the hpssacli route is that it would work in RAID mode too. So, keep that in mind as we go forward with this work. I agree with you that we need LocalDisk methods on top of whatever we might put in the standard plugin interface.

As far as use cases are concerned...we work heavily in HPC right now, and JBODs are used to minimize per-node licensing costs and maximize capacity. So, many of our configurations going forward will involve multiple daisy-chained high-capacity JBODs attached to x86 servers. I think we have a shot with libstoragemgmt to provide a significantly better user experience than with the current toolchains that exist today, particularly for this HPC market (though as I've discussed before, libstoragemgmt is useful for other scale-out solutions like Ceph as well). That's how I've justified my involvement here with my bosses. :)

cathay4t commented 8 years ago

@joehandzik Thanks for the inspiring use cases. I agree that both the plugin and library APIs should be able to query link speed, fans, sensors, power supplies, etc. I will try to allocate time for the library API mentioned above.

@joehandzik @BlaineEXE If either of you is working on the plugin or library API mentioned above, please create an issue and assign it to yourself with simple notes to avoid overlap. We should also discuss the API design at an early stage instead of during PR review when all patches are done.