CCI-MOC / xdmod-cntr

A project to prototype the use of XDMOD with OpenStack and OpenShift on the MOC

xdmod ticket #29300 setup of xdmod-openstack-scripts #202

Open rob-baron opened 1 year ago

rob-baron commented 1 year ago

Robert Bartlett Baron, reported over 1 year ago

I am trying to figure out how to install the xdmod-openstack-scripts; after reviewing the code, I am still uncertain how to do so. Do you have more explicit instructions for this?

Robert Bartlett Baron, said over 1 year ago

I noticed that the script hypervisor_facts.py authenticates using keystone v1, which has been deprecated and is no longer included in the OpenStack versions that we are running. Do you have an updated version of the script?

Gregary Dean, said over 1 year ago

Ticket: https://help.xdmod.org/support/tickets/29300

Hi Robert,

I'm sorry I haven't answered your questions sooner. I've had some other issues taking priority over the past few weeks.

I've merged your most recent ticket about the OpenStack scripts into this ticket and will try and answer your questions here.

The OpenStack scripts contain a script that exports OpenStack events to a file. The script provides two ways to retrieve the events from OpenStack, either the OpenStack API or connecting to the database directly. If you want to use the API, there are three patch files that need to be applied. These patches make it possible to get all the necessary events from the API. If you do not want to install the patches, you can add the -D or --use-db flag when running the script and the events will be retrieved from the database instead.

As for the question about the API versions, were you getting any specific errors when trying to run the scripts? I looked through the documentation again and it seems that the keystoneclient should return either a v2 or v3 class depending on which one you have installed. It does seem that the ceilometerclient and novaclient are using specific versions. I'm looking to see if I can make that more flexible to support both older and newer versions of OpenStack.

Can you share what you have refactored? It could be helpful to see what you have changed.

We are still supporting OpenStack but we haven't gotten time to look at the newest release yet. As for Kubernetes, we do not have plans to support that at this time. Out of curiosity, what sort of things would you want to track as far as Kubernetes goes? Also, what version of OpenStack are you running?

I saw you submitted another ticket with some errors you were getting as well. I'm going to look at that next and see what we can do to help.

Again, I'm sorry about the delay in responding.

Greg Dean
Scientific Programmer
Center for Computational Research
University at Buffalo

Robert Bartlett Baron, said over 1 year ago

We are currently running Queens and Train.

Neither Queens nor Train comes with either ceilometer or panko. In other words, neither the database nor the API you are using to pull your data exists in either version. Furthermore, it is my understanding that the database is not guaranteed to stay the same between versions of OpenStack. When I ran the scripts, they were not using v3 even though I have v2 and v3 installed.

I can rework the scripts to support both, but I would want to test the older version against the correct version of OpenStack. I can either share a GitHub repository with you, or submit a draft of what I am working on against https://github.com/ubccr/xdmod-openstack-scripts.git

We also have a separate monitoring service listening for the events. After getting a minimal viable product for the connection to xdmod, I am considering changing this to a queue that receives alerts from the monitoring service. Having something that is event driven would seem to be advantageous. This would also support the multiple OpenShift clusters we are planning to run as they would have a different monitoring service.

Now in terms of Kubernetes/OpenShift, we have a set of quotas that I expect we would eventually want to report on. Here is the list:

:persistentvolumeclaims

:requests.storage

:requests.ephemeral-storage

:limits.ephemeral-storage

:replicationcontrollers

:resourcequotas

:services

:services.loadbalancers

:services.nodeports

:secrets

:configmaps

:openshift.io/imagestreams

BestEffort:pods

NotBestEffort:pods

NotBestEffort:requests.memory

NotBestEffort:limits.memory

NotBestEffort:requests.cpu

NotBestEffort:limits.cpu

Terminating:pods

Terminating:requests.memory

Terminating:limits.memory

Terminating:requests.cpu

Terminating:limits.cpu

NotTerminating:pods

NotTerminating:requests.memory

NotTerminating:limits.memory

NotTerminating:requests.cpu

NotTerminating:limits.cpu
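For reporting purposes, the names in the list above can be split into a scope and a resource. This is only a minimal Python sketch, assuming the `scope:resource` layout shown in the list, with an empty scope meaning the quota is unscoped. The helper names `parse_quota_name` and `group_by_scope` are hypothetical and not part of any existing tool.

```python
from collections import defaultdict


def parse_quota_name(name):
    """Split 'Scope:resource' at the first colon; an empty scope means unscoped."""
    scope, _, resource = name.partition(":")
    return scope, resource


def group_by_scope(names):
    """Group resource names under their scope, preserving input order."""
    grouped = defaultdict(list)
    for name in names:
        scope, resource = parse_quota_name(name)
        grouped[scope].append(resource)
    return dict(grouped)


# A few entries copied from the list above.
quota_names = [
    ":persistentvolumeclaims",
    ":openshift.io/imagestreams",
    "BestEffort:pods",
    "Terminating:limits.cpu",
]

print(group_by_scope(quota_names))
```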

Robert Bartlett Baron, said over 1 year ago

I put the code here:

https://github.com/rob-baron/xdmod-openstack-scripts

  1. The hypervisor script is producing the same output as described here: (https://open.xdmod.org/9.5/cloud.html) in the section labeled "Cloud Resources Json file".

  2. The event reporting script is not complete and doesn't do exactly the same thing as the script in the repository. I just renamed it with an moc_ prefix for no good reason. I had questions regarding the structure produced by the event reporting script as described in (https://open.xdmod.org/9.5/cloud.html) in the section labeled "Details of generic file format for ingestion":

a. account - is this the project id (uuid)? In OpenStack the combination of domain/project_name and domain/user_name are unique - but these require 2 fields, whereas the project_id or user_id are unique identifiers on a given OpenStack instance. Since this needs to be rolled up to a project, and users are not unique to projects, it makes sense to me that this should be the project.

b. instance_type:name - is this the instance id (uuid)? Likewise for instances, it is the combination of domain/instance name that is unique or the instance_id - or is this the flavor name?

c. record_type - not sure what goes in here as the modw_cloud database didn't get created.

d. root_type - here again, not sure what you mean by "Type of storage ... "

e. block_device:account - I'm assuming this is the project id of the project that owns the block device.

f. block_device:user - I'm assuming this is the user id of the user that created the block device.

I haven't implemented the heartbeat yet. Not sure why you need the heartbeat. You have when the instance starts and when the instance ends, and you can also get when the instance is suspended. But since this is polling, I can easily check to see if the instance is active and generate one.

Since we don't have a service like panko in OpenStack, we were starting to implement our reporting based on what we used to monitor OpenStack Queens (Zabbix). What we will be using for Train is STF, but that isn't set up quite yet. There is yet another monitoring system we are using to monitor our Kubernetes (OpenShift) clusters. We will need to extend this to report on object store usage.

Gregary Dean, said over 1 year ago

Ticket: https://help.xdmod.org/support/tickets/29300

Hi Robert,

Thanks for sending this. I really appreciate it.

I do have a question about your OpenStack installations. Are you saying that you don't have Ceilometer or Panko installed for your clouds that run Queens and Train, or that Ceilometer and Panko aren't available at all for Queens or Train? It does seem that the Ceilometer API has been deprecated and removed as of Queens, so that does present a problem. Ceilometer itself does seem to still be available. I do see that Panko has been deprecated, but that happened only recently, in the Xena release.

You're right about the OpenStack database not being guaranteed to be the same between versions. The database option is there mainly because we've seen a couple of instances where the API was not as performant as needed and the database was. That occurred in older versions of OpenStack so it might not be much of an issue anymore.

Before I answer your questions, I will say that it might be better for you to try and copy the OpenStack format instead of the Generic format. Our docs on xdmod.org do not list the OpenStack format, but an example of it can be found in the git repo.

https://github.com/ubccr/xdmod/blob/xdmod10.0/tests/artifacts/xdmod/referencedata/openstack/2018-04-17T00:00:00_2018-04-30T23:59:59.json

There are a lot of events in that file that we filter out; mostly we just use the events whose names begin with compute. or volume. The OpenStack format allows for some more OpenStack-specific data, such as the domain. It also allows for separate fields for the project_name and project_id and the user_id and user_name.
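The filtering described above could look roughly like the following. This is a sketch only: it assumes each exported event is a dict with an `event_type` key, and the sample records are made up for illustration rather than taken from the reference file. It is not the filter XDMoD itself applies.

```python
# Prefixes named in the reply above; everything else is dropped.
KEPT_PREFIXES = ("compute.", "volume.")


def filter_events(events):
    """Return only events whose event_type starts with a kept prefix."""
    return [
        event for event in events
        if str(event.get("event_type") or "").startswith(KEPT_PREFIXES)
    ]


# Made-up sample records for illustration only.
sample_events = [
    {"event_type": "compute.instance.create.end"},
    {"event_type": "volume.create.end"},
    {"event_type": "image.update"},
    {"event_type": None},
]

kept = filter_events(sample_events)
print([event["event_type"] for event in kept])
```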


a. account - is this the project id (uuid)? In OpenStack the combination of domain/project_name and domain/user_name are unique - but these require 2 fields, whereas the project_id or user_id are unique identifiers on a given OpenStack instance. Since this needs to be rolled up to a project, and users are not unique to projects, it makes sense to me that this should be the project.

This should be the project_id.


b. instance_type:name - is this the instance id (uuid)? Likewise for instances, it is the combination of domain/instance name that is unique or the instance_id - or is this the flavor name?

This is the flavor name.

c. record_type - not sure what goes in here as the modw_cloud database didn't get created.

You can see what the values should be here. The record_type field has the values to use. We don't use this for anything at this point in time.

e. block_device:account - I'm assuming this is the project id of the project that owns the block device.

Yes. That is correct.

f. block_device:user - I'm assuming this is the user id of the user that created the block device.

Yes. That is correct.
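Putting the answers above together, a record could be assembled along these lines. This is a hedged sketch: the nesting of `instance_type` and `block_devices` is an assumption made for illustration and is not taken from the XDMoD documentation, and the input dicts (`server`, `volumes`) with their keys are hypothetical stand-ins for whatever the OpenStack client returns.

```python
def build_record(server, volumes):
    """Map hypothetical OpenStack data onto the fields discussed above."""
    return {
        # account: the owning project's id (answer a).
        "account": server["project_id"],
        # instance_type name: the flavor name, not the instance id (answer b).
        "instance_type": {"name": server["flavor_name"]},
        # block device account/user: owning project id and creating user id
        # (answers e and f). The key name "block_devices" is assumed.
        "block_devices": [
            {"account": vol["project_id"], "user": vol["user_id"]}
            for vol in volumes
        ],
    }


# Made-up identifiers for illustration only.
example = build_record(
    {"project_id": "proj-123", "flavor_name": "m1.small"},
    [{"project_id": "proj-123", "user_id": "user-456"}],
)
print(example)
```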

  1. I haven't implemented the heartbeat yet. Not sure why you need the heart beat. You have when the instance starts and when the instance ends, and you can also get when the instance is suspended. But since this is polling, I can easily check to see if the instance is active and generate one.

We use the heartbeat to make sure a VM is still active. We've seen issues before where a VM, in either OpenStack or other cloud systems, will just disappear or crash without sending an end event.
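With a polling collector, the heartbeat idea above might be sketched as follows: emit one heartbeat per instance that is active at poll time, and flag instances seen on the previous poll that are now missing. Everything here is an assumption for illustration, including the function name, the record keys, and the `HEARTBEAT_EVENT_TYPE` placeholder, which would need to be replaced with a value the ingestion accepts.

```python
from datetime import datetime, timezone

# Placeholder only; the real string must come from the allowed event types.
HEARTBEAT_EVENT_TYPE = "heartbeat"


def poll_heartbeats(active_ids, previously_seen, now=None):
    """Return (heartbeat events for active instances, ids that vanished)."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    heartbeats = [
        {
            "instance_id": instance_id,
            "event_type": HEARTBEAT_EVENT_TYPE,
            "event_time": stamp,
        }
        for instance_id in sorted(active_ids)
    ]
    # Instances present last poll but absent now never sent an end event.
    vanished = sorted(set(previously_seen) - set(active_ids))
    return heartbeats, vanished


beats, gone = poll_heartbeats({"vm-a", "vm-b"}, {"vm-a", "vm-c"})
print(len(beats), gone)
```

The `vanished` list is the case described above, where a VM disappears or crashes without an end event; how to record it is left open.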

As I said above, I think it might be easier to try and get this data in the OpenStack format rather than the Generic format. One other thing to note is that the event_type needs to be a specific string. For the generic format, this file, https://github.com/ubccr/xdmod/blob/xdmod10.0/configuration/etl/etl_data.d/cloud_common/event_type.json, lists all the event types that can be specified. The event_type field should map to one of the values in the event_type column. If you use the OpenStack format, you'll need to get the OpenStack event name. This file, https://github.com/ubccr/xdmod/blob/xdmod10.0/configuration/etl/etl_data.d/cloud_openstack/openstack_event_map.json, lists all the OpenStack event types we use and maps them to an event in the event_type.json file.

I'm going to look through this a bit more to see if anything sticks out, but I wanted to get a response to you today.
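Because the event_type has to be one of a fixed set of strings, it can help to check an export before ingesting it. The sketch below assumes the allowed names have already been collected into a Python set; it does not parse either JSON file, since their layout is not described in this thread. The function name and sample values are hypothetical.

```python
def unknown_event_types(events, allowed):
    """Return the sorted event_type values in events that are not allowed."""
    allowed = set(allowed)
    seen = {str(event.get("event_type", "<missing>")) for event in events}
    return sorted(seen - allowed)


# Made-up allowed set and export for illustration only.
allowed_types = {"compute.instance.create.end", "compute.instance.delete.end"}
export = [
    {"event_type": "compute.instance.create.end"},
    {"event_type": "compute.instance.resize.end"},
    {"instance_id": "vm-a"},
]

print(unknown_event_types(export, allowed_types))
```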

Thanks again,
Greg Dean

Robert Bartlett Baron, said over 1 year ago

Great, I will start working with that format.

I was wondering how the event_type.json and openstack_event_map.json fit in. Thanks for clarifying.

Robert Bartlett Baron, said over 1 year ago

I asked the folks who run our OpenStack about ceilometer and panko. With Queens, ceilometer is gone from their perspective, though they will need to see if any components still exist. I am under the impression that the ceilometer collection agents may be used by a different project. Panko is deprecated, though still installed in Queens. However, it is turned off because it constantly crashes due to the amount of data that it stores. We are still waiting to get accounts on Train.

Robert Bartlett Baron, said over 1 year ago

In implementing this script, when I ingest data I get a bunch of warnings like

2022-04-25 03:09:14 [warning] Record 695: String value found, but an integer is required
2022-04-25 03:09:14 [warning] Record 696: NULL value found, but a string is required, String value found, but an integer is required, NULL value found, but a string is required

Any suggestions for figuring out which fields are the ones that are not passing?

Robert Bartlett Baron, said about 1 year ago

Ok, I eventually figured this one out. We can close this ticket for now, though it would be nice to have a better error message.

This ticket has been Closed | about 1 year ago
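The thread does not record how the failing fields were eventually identified. One generic way to narrow down warnings like those quoted above is to check each record against the expected types before ingestion and report the field name. The sketch below is not XDMoD's validator; the `EXPECTED_TYPES` table and its field names are hypothetical and would need to match the actual schema.

```python
# Hypothetical field -> type table; replace with the real schema's fields.
EXPECTED_TYPES = {"instance_id": str, "memory_mb": int, "vcpus": int}


def find_type_errors(record, expected=EXPECTED_TYPES):
    """Return one message per field whose value has the wrong type."""
    problems = []
    for field, wanted in expected.items():
        value = record.get(field)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, wanted):
            problems.append(
                f"{field}: expected {wanted.__name__}, got {type(value).__name__}"
            )
    return problems


# Made-up record reproducing both kinds of warning.
bad_record = {"instance_id": None, "memory_mb": "2048", "vcpus": 2}
for message in find_type_errors(bad_record):
    print(message)
```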
