intel / intelRSD

Intel® Rack Scale Design Reference Software
http://intel.com/IntelRSD

environment for running PODM and PSME without RMM where PSME handles RMM's functionalities #5

Closed Vishesh-GIT closed 8 years ago

Vishesh-GIT commented 8 years ago

We have a setup where PODM runs on one Linux box, and psme-rest-server and the agents psme-compute-intel, psme-storage and psme-chassis run on another Linux box. We don't have RMM running, and we want PSME to handle RMM's functionalities and serve RMM's Redfish APIs. In psme-rest-server.json the "rmm-present" flag is set to false, indicating that there is no RMM. On PODM, /tmp/services.list contains the PSME service URL (http://PSME IP addr:8888/redfish/v1 psme). Since the "rmm-present" flag is set to false, we expected PODM to send the RMM-related Redfish APIs to the PSME REST server, but no RMM APIs were sent from PODM. We also tried adding an entry to /tmp/services.list (http://PSME IP addr:8888/redfish/v1 rmm) and changing the port number in /etc/pod-manager/service-connection.json to the same port number as the PSME service, but that did not help either.

So how can we set up this environment, where there is no RMM and PSME has to receive and handle RMM's Redfish APIs?

zhlicen commented 8 years ago

Just personal opinion:

maciejro commented 8 years ago

Hi @Vishesh-GIT,

First of all, some clarification:

Secondly, could you provide your scenario that you want to execute? That would be very helpful.

Hi @zhlicen,

About the first point - you are right - PODM is able to detect a service IP change based on the Service UUID, so an overwrite may happen.

About the second one - this is not a supported scenario and we cannot guarantee how PODM will react to such a configuration.

gopakumar-thekkedath commented 8 years ago

Hi @maciejro,

Thanks for providing the details. It would be very helpful if you could address the points below too.

To give some perspective on our current setup: we are running the PSME service and the compute agent on an x86 box which is attached to our rack. So, essentially, there would be a single PSME service for the whole rack, and the agent would communicate with the BMCs in the blades present in the rack via RMCP. We also want to introduce the RMM so that the chassis-related information can be provided to the PODM. We do not have an RMC at this point in time, hence the RMM would also run on the same x86 box where the PSME is present. As we require only a minimal implementation of RMM at this point (provide information about the drawers in the rack and represent power and thermal zone information), we thought of building a single binary (I will refer to this as the combined AMC binary) that could receive both PSME and RMM REST APIs and would communicate with both the compute and chassis agents as required.

1) You have concurred to @zhlicen's first point about the need for service UUID to be unique, now, if we provide same IP address but different port numbers for PSME and RMM services, would it work? That is, if we have the below entries in /tmp/services.list http://:/redfish/v1 psme http://:/redfish/v1 rmm

And in the combined AMC binary, have two ports (PSME_port_number, RMM_port_number) opened, then can we expect PODM to issue PSME REST APIs to x86box_IP_addr:PSME_port_number? and RMM REST APIs to x86box_IP_addr:RMM_port_number?

2) Does your 3rd point imply that, even if there is an RMM present and we have added the RMM IP and port number details in /tmp/services.list, the PODM will not issue any RMM REST APIs? Or did I interpret it wrongly? (This query is not specific to our setup but applies to any Rack Scale implementation.) If this is indeed the case, then how can one provide the chassis-related information (which includes power and thermal zone information too) to PODM?

3) In the RMM API specification guide, the table under the 'API structure and relation' section shows PowerZone and ThermalZone APIs only against 'rack'. There are no similar APIs shown against 'drawer'. Does this mean that we can represent the power and thermal zone information only at rack level?

gopakumar-thekkedath commented 8 years ago

Under point 1 in my earlier post, the example IP address and port number that were meant to appear in the /tmp/services.list entries got chopped off. It should read as below:

http://x86box_IP_addr:PSME_port_number/redfish/v1 psme
http://x86box_IP_addr:RMM_port_number/redfish/v1 rmm
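For readers following along, here is a minimal sketch of how entries like these could be sanity-checked before handing them to PODM. It assumes the file holds one `<service root URL> <type>` pair per line, which is only inferred from the examples in this thread and not from the PODM source. `ServiceEntry`, `parse_services_list` and `distinct_endpoints` are hypothetical helper names, the address 192.0.2.10 and port 8090 are placeholders, and 8888 is the PSME port mentioned in the first post.

```python
from typing import List, NamedTuple, Optional
from urllib.parse import urlsplit


class ServiceEntry(NamedTuple):
    url: str
    host: Optional[str]
    port: Optional[int]
    kind: str  # e.g. "psme" or "rmm", as used in this thread


def parse_services_list(text: str) -> List[ServiceEntry]:
    """Parse lines of the assumed form '<service root URL> <type>'."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        # The type is the last whitespace-separated token on the line.
        url, kind = line.rsplit(maxsplit=1)
        parts = urlsplit(url)
        entries.append(ServiceEntry(url, parts.hostname, parts.port, kind))
    return entries


def distinct_endpoints(entries: List[ServiceEntry]) -> bool:
    """True when no two entries share the same host and port."""
    return len({(e.host, e.port) for e in entries}) == len(entries)


if __name__ == "__main__":
    sample = (
        "http://192.0.2.10:8888/redfish/v1 psme\n"
        "http://192.0.2.10:8090/redfish/v1 rmm\n"
    )
    for entry in parse_services_list(sample):
        print(entry.kind, entry.host, entry.port)
```

The check only confirms that the PSME and RMM entries do not collide on host and port; it says nothing about how PODM itself reads the file.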

maciejro commented 8 years ago

Hi @gopakumar-thekkedath,

Thanks for the information, that was very helpful! Also some good news: we've tested PSME with a 'Rack' Chassis available on its API and it worked. However, there is no connection between the 'Pod' Chassis and the 'Rack' Chassis (in the sense of a 'ContainedBy' relationship).

Here is my response to your questions:

> 1) You have concurred to @zhlicen's first point about the need for service UUID to be unique, now, if we provide same IP address but different port numbers for PSME and RMM services, would it work? That is, if we have the below entries in /tmp/services.list http://x86box_IP_addr:PSME_port_number/redfish/v1 psme http://x86box_IP_addr:RMM_port_number/redfish/v1 rmm
>
> And in the combined AMC binary, have two ports (PSME_port_number, RMM_port_number) opened, then can we expect PODM to issue PSME REST APIs to x86box_IP_addr:PSME_port_number? and RMM REST APIs to x86box_IP_addr:RMM_port_number?

It will work properly, since those endpoints use different ports. There should be no problem in communication between PODM and a box running the two services.
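The single-host, two-port layout discussed here can be sketched as one process with two HTTP listeners. This is only an illustration of the idea, not the combined AMC binary or the PSME REST server: `make_handler` and `start_service` are hypothetical names, the JSON body is a placeholder rather than a real Redfish service root, and each listener reports its own UUID only because this thread notes that PODM tracks services by Service UUID.

```python
import json
import threading
import uuid
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_handler(service_name):
    """Build a handler class that answers GET /redfish/v1 for one service."""
    # One UUID per listener, so the two services stay distinguishable.
    payload = {"Name": service_name, "UUID": str(uuid.uuid4())}

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path.rstrip("/") != "/redfish/v1":
                self.send_error(404)
                return
            body = json.dumps(payload).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the sketch quiet

    return Handler


def start_service(service_name, port):
    """Start one listener on localhost; port 0 lets the OS pick a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), make_handler(service_name))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


if __name__ == "__main__":
    psme = start_service("psme", 0)
    rmm = start_service("rmm", 0)
    print("psme port:", psme.server_address[1])
    print("rmm port:", rmm.server_address[1])
    psme.shutdown()
    rmm.shutdown()
```

Each listener would then get its own line in /tmp/services.list, with the same IP address and a different port.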

> 2) Does your 3rd point imply that, even if there is an RMM present and we have added the RMM IP and port number details in /tmp/services.list, the PODM will not issue any RMM REST APIs? Or did I interpret it wrongly? (This query is not specific to our setup but applies to any Rack Scale implementation.) If this is indeed the case, then how can one provide the chassis-related information (which includes power and thermal zone information too) to PODM?

Please let me add some clarification to my previous answer - PODM will read the whole RMM API (excluding 'Drawer' Chassis resources), but will not be able to invoke RMM API actions on it using the HTTP PUT, PATCH, POST or DELETE methods.

> 3) In the RMM API specification guide, the table under the 'API structure and relation' section shows PowerZone and ThermalZone APIs only against 'rack'. There are no similar APIs shown against 'drawer'. Does this mean that we can represent the power and thermal zone information only at rack level?

PODM is not reading ‘Drawer’ Chassis data from RMM, so you can put ThermalZone/PowerZone information on drawer resources exposed by PSME.

gopakumar-thekkedath commented 8 years ago

Hi @maciejro,

Thanks for the replies, we are finding this discussion extremely helpful. You mentioned that,

> we've tested PSME with a 'Rack' Chassis available on its API and it worked. However, there is no connection between the 'Pod' Chassis and the 'Rack' Chassis (in the sense of a 'ContainedBy' relationship).

We are also finding that we can provide a rack Chassis from PSME, but in our case the PODM is not issuing any power or thermal related APIs, despite us providing the information that the Chassis contains Power and Thermal zones. This behavior is seen for chassis of all types.

Is the PODM built from the sources in this repo expected to invoke power/thermal commands if the Chassis resource it reads from PSME advertises those capabilities (which would mean that we need to take another look at our PSME code)? Or did you have to make any changes in PODM, not yet committed here, to get that going (in which case we will wait for those changes to come in)?

maciejro commented 8 years ago

Hi @gopakumar-thekkedath,

PODM is able to read Power and Thermal resources from the PSME Chassis and, if I understood you correctly, this scenario works properly in your case. However, PODM cannot invoke any HTTP actions on those resources (as you can see in the Intel® Rack Scale Design Pod Manager API Specification). This means that PODM does not issue any POST / PATCH / PUT / DELETE requests, neither internally nor as an effect of a user action on the PODM Northbound API. We do not plan to implement such a feature in the near future.

gopakumar-thekkedath commented 8 years ago

Hi @maciejro ,

No, in our case, the PODM is not issuing GET for power and thermal zone, despite us providing power and thermal information in the chassis object. Do you have any thoughts on what could be causing that?

maciejro commented 8 years ago

Hi @gopakumar-thekkedath,

Could you provide JSON with Chassis, Thermal and Power Zones and collections of them exposed by your implementation of PSME? That would be very helpful!

Vishesh-GIT commented 8 years ago

Hi @maciejro, as you requested, I have attached a DOCX document that contains the JSON output of the Redfish API responses from the northbound interface of the PSME REST server, which is exposed to PODM, as well as the Pod1 Chassis, Rack Chassis and Drawer Chassis exposed by the northbound interface of PODM. I have also attached an OrientDB snapshot that shows the "contains chassis" relationship built by PODM. I hope this is helpful for your analysis and lets you guide us further.

podm-orientdb-snapshot PSME-chassis-thermal-power-JSON.docx PODM-chassis-thermal-power-JSON.docx

zhlicen commented 8 years ago

The PODM code shows that Power&Thermal Zones will be parsed, only if they are defined under the Oem section.

maciejro commented 8 years ago

Hi @Vishesh-GIT, @zhlicen,

The answer that @zhlicen has provided is correct. The PODM Southbound API is aligned to the RMM API (you can find more information in the Intel® Rack Scale Design RMM REST-API Specification), so it is expected that Power & Thermal Zones can be read only via the Oem object. You can look at the ChassisResourceImpl class to see how those resources are loaded into PODM.
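As a rough illustration of that point, the sketch below checks whether a Chassis document exposes its zone links only at the top level and, if so, copies them under the Oem object. The key names `Intel_RackScale`, `PowerZones` and `ThermalZones` are assumptions made for this example and were not checked against the RMM REST-API Specification or ChassisResourceImpl, so please take the exact property names from the specification. `zones_outside_oem` and `move_zones_into_oem` are hypothetical helpers.

```python
import copy
from typing import Any, Dict, List

# Assumed names, for illustration only; confirm them in the RMM API spec.
OEM_VENDOR_KEY = "Intel_RackScale"
ZONE_KEYS = ("PowerZones", "ThermalZones")


def zones_outside_oem(chassis: Dict[str, Any]) -> List[str]:
    """List zone keys present at the top level but missing under Oem."""
    oem = chassis.get("Oem", {}).get(OEM_VENDOR_KEY, {})
    return [key for key in ZONE_KEYS if key in chassis and key not in oem]


def move_zones_into_oem(chassis: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the Chassis with zone links placed under Oem."""
    result = copy.deepcopy(chassis)  # leave the caller's document untouched
    oem = result.setdefault("Oem", {}).setdefault(OEM_VENDOR_KEY, {})
    for key in ZONE_KEYS:
        if key in result:
            # Keep an existing Oem value if one is already there.
            oem.setdefault(key, result.pop(key))
    return result


if __name__ == "__main__":
    chassis = {
        "Id": "Rack1",
        "ChassisType": "Rack",
        "PowerZones": {"@odata.id": "/redfish/v1/Chassis/Rack1/PowerZones"},
        "ThermalZones": {"@odata.id": "/redfish/v1/Chassis/Rack1/ThermalZones"},
    }
    print(zones_outside_oem(chassis))
    print(move_zones_into_oem(chassis))
```

A check like this could run against the PSME output before PODM discovery, to catch zone links that PODM would otherwise skip.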

Vishesh-GIT commented 8 years ago

@zhlicen @maciejro Thanks a lot for your valuable input. I have added the relevant information in the Oem section and am now able to receive both the Thermal and Power zone APIs, and the same is visible in OrientDB.

maciejro commented 8 years ago

Hi @Vishesh-GIT,

That's great news! We hope that you will enjoy using the Rack Scale Design solution!

If there are no more questions related to this topic, I will close this issue tomorrow.