IBM / CAST

CAST can enhance the system management of cluster-wide resources. It consists of the open source tools: cluster system management (CSM) and burst buffer.
Eclipse Public License 1.0

csm fvt test case to handle SSDs on the compute nodes during inventor… #963

Closed williammorrison2 closed 4 years ago

williammorrison2 commented 4 years ago

…y collection.

Purpose

Adds a test case that runs `csm_node_attributes_query_details` on all compute nodes and checks the SSD size.

Origin

This test case will catch the issue related to #945

How to Test

Run tests in the CSM FVT environment

  1. Ran a test case in the CSM FVT environment using 4 compute nodes (1 node with no SSDs).
  2. Ran another test using a node in the inventory with the size value set to -1 (to show the FAILED case).

Screenshots

  1. Test case #1 (with SKIPPED flag)
    
    Finished. Cleaning up...
    Test complete: rc=0
    ------------------------------------------------------------
    [2020-09-03 10:43:13.3119] Test Case 1:   csm_node_resources_query_all:                                                                                 PASS
    [2020-09-03 10:43:13.3144] Test Case 1:   check node_ready=n:                                                                                           PASS
    [2020-09-03 10:43:13.3313] Test Case 2:   Calling csm_node_resources_query on 1 node:                                                                   PASS
    [2020-09-03 10:43:13.3482] Test Case 3:   csm_node_resources_query on all nodes:                                                                        PASS
    [2020-09-03 10:43:13.3639] Test Case 4:   Calling csm_node_attributes_query on 1 node:                                                                  PASS
    [2020-09-03 10:43:13.3797] Test Case 5:   Calling csm_node_attributes_query on all nodes:                                                               PASS
    [2020-09-03 10:43:13.4112] Test Case 6:   Calling csm_node_attributes_update to state=IN_SERVICE on all nodes:                                          PASS
    [2020-09-03 10:43:13.4273] Test Case 7:   Calling csm_node_attributes_query on all nodes:                                                               PASS
    [2020-09-03 10:43:13.4297] Test Case 7:   Checking for state=IN_SERVICE:                                                                                PASS
    [2020-09-03 10:43:13.4408] Test Case 8:   Calling csm_node_query_state_history on 1 node:                                                               PASS
    [2020-09-03 10:43:13.4444] Test Case 8:   Checking for state=IN_SERVICE and CSM_API:                                                                    PASS
    [2020-09-03 10:43:13.4588] Test Case 9:   Calling csm_node_attributes_query_details on 1 node:                                                          PASS
    [2020-09-03 10:43:13.4706] Test Case 10:  csm_node_attributes_query_details on all nodes (error expected):                                              PASS
    [2020-09-03 10:43:13.4992] Test Case 10a: csm_node_attributes_query_details on c650f99p18 check ssd size - 1600321314816:                               PASS
    [2020-09-03 10:43:13.5226] Test Case 10a: csm_node_attributes_query_details on c650f99p26 check ssd size - 1600321314816:                               PASS
    [2020-09-03 10:43:13.5470] Test Case 10a: csm_node_attributes_query_details on c650f99p28 check ssd size - 1600321314816:                               PASS
    [2020-09-03 10:43:13.5600] Test Case 10a: csm_node_attributes_query_details on c650f99p36 check ssd size - No SSDs:                                  SKIPPED
    [2020-09-03 10:43:13.5705] Test Case 11:  Calling csm_node_attributes_query_history on 1 node:                                                          PASS
    [2020-09-03 10:43:13.5808] Test Case 12:  Calling csm_node_attributes_query_history on all nodes (error expected):                                      PASS
    [2020-09-03 10:43:13.6281] Test Case 13:  calling csm_node_delete:                                                                                      PASS
    RECOVERING c650f99p18...
    SUCCESS
    ------------------------------------------------------------
                node Bucket COMPLETED
    ------------------------------------------------------------
    Additional Flags:

2. Manually update the csm_ssd record to indicate a `-1` value.

    csmdb=> UPDATE csm_ssd set size='-1' where node_name='c650f99p26';
    UPDATE 1

    csmdb=> select * from csm_ssd where node_name='c650f99p26';
    -[ RECORD 1 ]-----------------+-------------------------------------
    serial_number                 | S3RVNA0K400104
    node_name                     | c650f99p26
    update_time                   | 2020-09-03 10:50:52.170803
    device_name                   | PCIe3 1.6TB NVMe Flash Adapter II x8
    pci_bus_id                    | 0030:01:00.0
    fw_ver                        | MN12MN12
    size                          | -1
    wear_lifespan_used            | 0
    wear_total_bytes_written      | 3748815360000
    wear_total_bytes_read         | 4817360384000
    wear_percent_spares_remaining | 100

Ran the `./node.sh` bucket to test

    Finished. Cleaning up...
    Test complete: rc=0

    [2020-09-03 10:52:03.9834] Test Case 1:   csm_node_resources_query_all:                                                                                 PASS
    [2020-09-03 10:52:03.9859] Test Case 1:   check node_ready=n:                                                                                           PASS
    [2020-09-03 10:52:04.0036] Test Case 2:   Calling csm_node_resources_query on 1 node:                                                                   PASS
    [2020-09-03 10:52:04.0212] Test Case 3:   csm_node_resources_query on all nodes:                                                                        PASS
    [2020-09-03 10:52:04.0375] Test Case 4:   Calling csm_node_attributes_query on 1 node:                                                                  PASS
    [2020-09-03 10:52:04.0538] Test Case 5:   Calling csm_node_attributes_query on all nodes:                                                               PASS
    [2020-09-03 10:52:04.0834] Test Case 6:   Calling csm_node_attributes_update to state=IN_SERVICE on all nodes:                                          PASS
    [2020-09-03 10:52:04.0996] Test Case 7:   Calling csm_node_attributes_query on all nodes:                                                               PASS
    [2020-09-03 10:52:04.1021] Test Case 7:   Checking for state=IN_SERVICE:                                                                                PASS
    [2020-09-03 10:52:04.1136] Test Case 8:   Calling csm_node_query_state_history on 1 node:                                                               PASS
    [2020-09-03 10:52:04.1173] Test Case 8:   Checking for state=IN_SERVICE and CSM_API:                                                                    PASS
    [2020-09-03 10:52:04.1316] Test Case 9:   Calling csm_node_attributes_query_details on 1 node:                                                          PASS
    [2020-09-03 10:52:04.1437] Test Case 10:  csm_node_attributes_query_details on all nodes (error expected):                                              PASS
    [2020-09-03 10:52:04.1730] Test Case 10a: csm_node_attributes_query_details on c650f99p18 check ssd size - 1600321314816:                               PASS
    [2020-09-03 10:52:04.1989] Test Case 10a: csm_node_attributes_query_details on c650f99p26 check ssd size - -1:                                        FAILED


This should pick up any inventory records that return the `-1` value, which would catch the issue related to #945.
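
The verdict rule implied by the two runs above can be sketched roughly as follows. This is not the code from this PR: `classify_ssd_size` is a hypothetical helper name, and the sketch assumes the SSD size has already been extracted from the `csm_node_attributes_query_details` output, with an empty value standing for a node that has no SSDs.

```shell
# Hypothetical helper (not from the PR): map a reported SSD size to a verdict.
# Assumption: an empty argument means the node reported no SSDs.
classify_ssd_size() {
    size="$1"
    if [ -z "$size" ]; then
        # No SSD on this node, so there is nothing to check.
        echo "SKIPPED"
    elif [ "$size" -gt 0 ] 2>/dev/null; then
        # A positive byte count is treated as a valid inventory record.
        echo "PASS"
    else
        # -1, 0, or a non-numeric value indicates a bad inventory record.
        echo "FAILED"
    fi
}

classify_ssd_size 1600321314816   # prints PASS
classify_ssd_size ""              # prints SKIPPED
classify_ssd_size -1              # prints FAILED
```

Treating a missing SSD as SKIPPED rather than FAILED keeps mixed clusters (such as the 4-node run with one SSD-less node) from reporting a false failure.
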
## Open Questions and Pre-Merge TODOs
_Make sure you attempted to do the following:_
- [x] Assign @besawn  to review the code

I also modified the `csmtest/include/functions.sh` script to include timestamps in the CSM FVT output logs for each test case. This gives a better understanding of when each test case was executed.
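
As an illustration of the timestamp prefix seen in the logs above, a minimal sketch might look like the following. The actual change to `functions.sh` is not shown here, so `log_result` and its arguments are hypothetical, and the sketch assumes GNU `date` (for the `%N` nanoseconds field).

```shell
# Hypothetical logger (not the real functions.sh code).
# Assumption: GNU date is available, since %N is not portable.
log_result() {
    # Keep "YYYY-MM-DD HH:MM:SS." plus four fractional digits (24 characters).
    ts=$(printf '%.24s' "$(date '+%Y-%m-%d %H:%M:%S.%N')")
    # $1 = test case label, $2 = description, $3 = result
    printf '[%s] %s: %s: %s\n' "$ts" "$1" "$2" "$3"
}

log_result "Test Case 1" "csm_node_resources_query_all" "PASS"
```
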