CREATE TABLE modw_cloud.tmp_event_reconstructed_1682775949 LIKE modw_cloud.event_reconstructed; ALTER TABLE modw_cloud.tmp_event_reconstructed_1682775949 DISABLE KEYS; LOAD DATA LOCAL INFILE '/tmp/modw_cloud.event_reconstructed.data.ts_1682775949.114201639119VqyP' INTO TABLE modw_cloud.tmp_event_reconstructed_1682775949 FIELDS TERMINATED BY 0x1e OPTIONALLY ENCLOSED BY 0x1f ESCAPED BY 0x5c LINES TERMINATED BY 0x1d (resource_id,instance_id,start_time_ts,start_event_id,end_time_ts,end_event_id); SHOW WARNINGS; INSERT INTO modw_cloud.event_reconstructed (resource_id,instance_id,start_time_ts,start_event_id,end_time_ts,end_event_id) SELECT resource_id,instance_id,start_time_ts,start_event_id,end_time_ts,end_event_id FROM modw_cloud.tmp_event_reconstructed_1682775949 ON DUPLICATE KEY UPDATE resource_id=VALUES(resource_id),instance_id=VALUES(instance_id),start_time_ts=VALUES(start_time_ts),start_event_id=VALUES(start_event_id),end_time_ts=VALUES(end_time_ts),end_event_id=VALUES(end_event_id); DROP TABLE modw_cloud.tmp_event_reconstructed_1682775949;
FROM modw_cloud.event AS e WHERE instance_id IN (SELECT DISTINCT instance_id from modw_cloud.event WHERE last_modified > "2018-01-01 12:30:00") AND event_type_id IN (1,2,3,4,6,5,7,8,17,19,20,44,45,54,55,56,57,58,59,60,61,62,63,64,16)
UNION ALL
SELECT 0,0,0,0,0,0,0,0
ORDER BY 1 DESC, 2 DESC, 3 ASC, 4 DESC
2023-04-29 13:46:08 [debug] Loaded 2,393 records into 'event_reconstructed'
2023-04-29 13:46:08 [debug] Loaded 1 files in 0s
2023-04-29 13:46:08 [info] ETL\Ingestor\CloudStateReconstructorTransformIngestor: Processed 2,393 records (634,912 source records) in 19s
2023-04-29 13:46:08 [info] Returning buffered query mode to: true
(SELECT id FROM modw_cloud.processor_buckets pb WHERE itt.num_cores BETWEEN pb.min_processors AND pb.max_processors) AS processorbucket_id,
(SELECT id FROM modw_cloud.memory_buckets mb WHERE itt.memory_mb BETWEEN mb.min_memory AND mb.max_memory) AS memorybucket_id,
itt.disk_gb AS disk_gb,
FLOOR(e.start_time_ts) AS start_time_ts,
FLOOR(e.end_time_ts) AS end_time_ts,
YEAR(FROM_UNIXTIME(e.start_time_ts)) * 100000 + DAYOFYEAR(FROM_UNIXTIME(e.start_time_ts)) AS start_day_id,
YEAR(FROM_UNIXTIME(e.end_time_ts)) * 100000 + DAYOFYEAR(FROM_UNIXTIME(e.end_time_ts)) AS end_day_id,
FLOOR(e.end_time_ts) - FLOOR(e.start_time_ts) AS wallduration,
ev.person_id AS person_id,
ev.systemaccount_id AS systemaccount_id,
ev.submission_venue_id AS submission_venue_id,
ev.domain_id AS domain_id,
ev.service_provider AS service_provider,
a.account_id AS account_id,
a.principalinvestigator_person_id AS principalinvestigator_person_id,
a.fos_id AS fos_id,
ev.host_id AS host_id
FROM modw_cloud.event_reconstructed AS e
JOIN modw_cloud.event AS ev ON e.start_event_id = ev.event_type_id AND e.start_time_ts = ev.event_time_ts AND e.instance_id = ev.instance_id AND e.resource_id = ev.resource_id
JOIN modw_cloud.instance AS it ON e.instance_id = it.instance_id AND e.resource_id = it.resource_id
JOIN modw_cloud.instance_data AS itd ON itd.resource_id = it.resource_id AND itd.event_id = ev.event_id
JOIN modw_cloud.instance_type AS itt ON itt.resource_id = it.resource_id AND itt.instance_type_id = itd.instance_type_id
LEFT JOIN modw_cloud.account AS a ON it.account_id = a.account_id
ORDER BY resource_id asc, instance_id asc, start_time_ts asc
ON DUPLICATE KEY UPDATE instance_id=VALUES(instance_id),start_time=VALUES(start_time),start_event_type_id=VALUES(start_event_type_id),end_time=VALUES(end_time),end_event_type_id=VALUES(end_event_type_id),resource_id=VALUES(resource_id),instance_type=VALUES(instance_type),instance_type_id=VALUES(instance_type_id),num_cores=VALUES(num_cores),memory_mb=VALUES(memory_mb),processorbucket_id=VALUES(processorbucket_id),memorybucket_id=VALUES(memorybucket_id),disk_gb=VALUES(disk_gb),start_time_ts=VALUES(start_time_ts),end_time_ts=VALUES(end_time_ts),start_day_id=VALUES(start_day_id),end_day_id=VALUES(end_day_id),wallduration=VALUES(wallduration),person_id=VALUES(person_id),systemaccount_id=VALUES(systemaccount_id),submission_venue_id=VALUES(submission_venue_id),domain_id=VALUES(domain_id),service_provider=VALUES(service_provider),account_id=VALUES(account_id),principalinvestigator_person_id=VALUES(principalinvestigator_person_id),fos_id=VALUES(fos_id),host_id=VALUES(host_id)
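As a worked example of the start_day_id / end_day_id expressions in the query above: they pack the year and the day of the year into a single integer (YEAR * 100000 + DAYOFYEAR). The Python sketch below reproduces that arithmetic outside the database. It assumes the timestamp is interpreted as UTC, whereas FROM_UNIXTIME uses the database session time zone, and the function name day_id is made up for illustration.

```python
from datetime import datetime, timezone


def day_id(unix_ts: float) -> int:
    """Mirror YEAR(FROM_UNIXTIME(ts)) * 100000 + DAYOFYEAR(FROM_UNIXTIME(ts)).

    Assumption: the timestamp is interpreted as UTC.
    """
    moment = datetime.fromtimestamp(unix_ts, tz=timezone.utc)
    return moment.year * 100000 + moment.timetuple().tm_yday


# 1644361873 is 2022-02-08 23:11:13 UTC, the 39th day of 2022.
print(day_id(1644361873))  # 202200039
```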
This error would happen if you have a VM whose number of processors is outside of the ranges that exist in the modw_cloud.processor_buckets table. Can you run "SELECT * FROM modw_cloud.processor_buckets" and see what it returns? This is the default for the table:
Is there an easy way to find the records whose number of processors is, say, between zero and 1?
Gregary Dean , said 4 months ago
Cc: joachimw@bu.edu, msd@bu.edu
Ticket: https://help.xdmod.org/support/tickets/32768
Hi Rob,
Here are two queries that can help find that information. The first finds any OpenStack flavor whose number of CPUs is less than 1. This will not get you the individual VMs, just the OpenStack flavors.
SELECT * FROM modw_cloud.instance_type WHERE num_cores < 1 AND resource_id != -1
This second one will get you both the UUID and OpenStack flavor for a VM.
SELECT
    i.provider_identifier,
    itt.display
FROM
    event AS ev
JOIN
    instance AS i ON ev.instance_id = i.instance_id
JOIN
    instance_data AS itd ON itd.resource_id = ev.resource_id AND itd.event_id = ev.event_id
JOIN
    instance_type AS itt ON itt.instance_type_id = itd.instance_type_id
WHERE
    ev.event_type_id IN (1,7,57,56,58,60,63,2,8,16,20,61,59,4,3,5,44,54,62,64,6,17,19,45,55)
AND
    itt.num_cores < 1
GROUP BY
    1
ORDER BY
    2 DESC
One of these should help with finding the issue.
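To illustrate why a core count outside every bucket range is a problem, here is a minimal sketch using Python's sqlite3 module with toy tables. The bucket ranges and the 'example.small' flavor are hypothetical (they are not the XDMoD defaults), and the tables are cut down to the columns needed. A LEFT JOIN on the same BETWEEN condition used for processorbucket_id returns no bucket for a flavor with 0 cores.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE processor_buckets (id INTEGER, min_processors INTEGER, max_processors INTEGER);
    CREATE TABLE instance_type (instance_type_id INTEGER, display TEXT, num_cores INTEGER);
    -- Hypothetical bucket ranges, NOT the XDMoD defaults.
    INSERT INTO processor_buckets VALUES (1, 1, 1), (2, 2, 4), (3, 5, 16);
    -- One zero-core flavor and one made-up, well-defined flavor.
    INSERT INTO instance_type VALUES (13, 'unknown flavor', 0), (21, 'example.small', 2);
""")


def unbucketed_instance_types(db):
    """Return (instance_type_id, display, num_cores) rows that match no bucket."""
    return db.execute("""
        SELECT itt.instance_type_id, itt.display, itt.num_cores
        FROM instance_type AS itt
        LEFT JOIN processor_buckets AS pb
            ON itt.num_cores BETWEEN pb.min_processors AND pb.max_processors
        WHERE pb.id IS NULL
        ORDER BY itt.instance_type_id
    """).fetchall()


print(unbucketed_instance_types(conn))  # [(13, 'unknown flavor', 0)]
```

The SELECT itself uses only standard SQL, so a similar query should also work against the real modw_cloud tables, but that has not been verified here.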
-greg
Robert Bartlett Baron , said 3 months ago
Interestingly enough, the result of the first query turned up 8 such flavors:
+-------------+------------------+------------------------------------------------------+------------------------------------------------------+-------------+-----------+-----------+---------+-------------------+----------+
| resource_id | instance_type_id | instance_type | display | description | num_cores | memory_mb | disk_gb | start_time | end_time |
+-------------+------------------+------------------------------------------------------+------------------------------------------------------+-------------+-----------+-----------+---------+-------------------+----------+
| 2 | 13 | unknown flavor(0cdf836c-e0a6-48ba-b98d-e8ed7e313acf) | unknown flavor(0cdf836c-e0a6-48ba-b98d-e8ed7e313acf) | NULL | 0 | 0 | 0 | 1644361873.000000 | NULL |
| 2 | 14 | unknown flavor(1202a547-f2d7-4f8c-bb2b-e55bcce24f0f) | unknown flavor(1202a547-f2d7-4f8c-bb2b-e55bcce24f0f) | NULL | 0 | 0 | 0 | 1649795786.000000 | NULL |
| 2 | 15 | unknown flavor(255ae564-1164-43eb-b3da-13a115f4c722) | unknown flavor(255ae564-1164-43eb-b3da-13a115f4c722) | NULL | 0 | 0 | 0 | 1644526590.000000 | NULL |
| 2 | 16 | unknown flavor(3dfd915f-4730-434f-aec9-f1fb738423fd) | unknown flavor(3dfd915f-4730-434f-aec9-f1fb738423fd) | NULL | 0 | 0 | 0 | 1646354299.000000 | NULL |
| 2 | 17 | unknown flavor(8dcbf98b-412c-44b7-a56a-cb8dbec11d4b) | unknown flavor(8dcbf98b-412c-44b7-a56a-cb8dbec11d4b) | NULL | 0 | 0 | 0 | 1647996018.000000 | NULL |
| 2 | 18 | unknown flavor(d3652ccd-713e-40a0-83b6-78888bf10015) | unknown flavor(d3652ccd-713e-40a0-83b6-78888bf10015) | NULL | 0 | 0 | 0 | 1647960881.000000 | NULL |
| 2 | 19 | unknown flavor(e2ec72ca-d9a7-4be8-991b-6c94e109ff95) | unknown flavor(e2ec72ca-d9a7-4be8-991b-6c94e109ff95) | NULL | 0 | 0 | 0 | 1644361910.000000 | NULL |
| 2 | 20 | unknown flavor(f8a78508-9dc5-4374-8f12-4b3ea0037991) | unknown flavor(f8a78508-9dc5-4374-8f12-4b3ea0037991) | NULL | 0 | 0 | 0 | 1646836326.000000 | NULL |
+-------------+------------------+------------------------------------------------------+------------------------------------------------------+-------------+-----------+-----------+---------+-------------------+----------+
8 rows in set (0.03 sec)
Also, from the second query, I see that data is continuing to arrive from VMs that use one of the 8 unknown flavors.
I tried setting those 8 flavors to have 1 cpu, and that failed spectacularly: the next time the shredder and ingestor ran, I suddenly had duplicates of those 8 unknown flavors, with the new ones having 0 as the cpu.
So just to bounce the idea off of you, I am planning to do the following:
0) backup the DB
1) change the code that pulls the data to place 1s in the cpu, memory, and disk fields
2) have other people resize their VMs to defined flavors
3) work through the shredding tables to add 1's to the flavors
4) re-ingest the cloud data
I'm sorry for the delay in answering. I was on vacation for the past week. Those steps sound fine. If you are planning on re-ingesting the data from your log files, I would suggest removing the offending instance types from the instance type table, along with any records in other tables that reference these instance types, before re-ingesting. Here are two queries that should help with doing that.
DELETE
    ev,
    i,
    itd,
    itt
FROM
    event AS ev
JOIN
    instance AS i ON ev.instance_id = i.instance_id
JOIN
    instance_data AS itd ON itd.resource_id = ev.resource_id AND itd.event_id = ev.event_id
JOIN
    instance_type AS itt ON itt.instance_type_id = itd.instance_type_id
WHERE
    ev.event_type_id IN (1,7,57,56,58,60,63,2,8,16,20,61,59,4,3,5,44,54,62,64,6,17,19,45,55)
AND
    itt.num_cores < 1;
DELETE sr FROM modw_cloud.session_records AS sr WHERE sr.instance_type_id IN (13,14,15,16,17,18,19,20)
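Because these DELETE statements are destructive, it may be worth previewing what they would touch first. The sketch below shows the idea for the second statement: run the same IN (...) filter as a SELECT COUNT grouped by instance_type_id. It uses Python's sqlite3 module with a toy session_records table and made-up rows. Only the IDs 13 through 20 come from the query output earlier in this thread, and the helper name preview_delete is hypothetical.

```python
import sqlite3

# Instance type IDs of the 8 "unknown flavor" rows reported in the thread.
UNKNOWN_FLAVOR_IDS = (13, 14, 15, 16, 17, 18, 19, 20)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE session_records (session_id INTEGER, instance_type_id INTEGER)")
# Made-up rows: three reference unknown flavors, one references another flavor.
conn.executemany(
    "INSERT INTO session_records VALUES (?, ?)",
    [(1, 13), (2, 13), (3, 20), (4, 21)],
)


def preview_delete(db, type_ids):
    """Count the session_records rows the DELETE would remove, per instance type."""
    placeholders = ",".join("?" for _ in type_ids)
    sql = (
        "SELECT instance_type_id, COUNT(*) FROM session_records "
        f"WHERE instance_type_id IN ({placeholders}) "
        "GROUP BY instance_type_id ORDER BY instance_type_id"
    )
    return db.execute(sql, type_ids).fetchall()


print(preview_delete(conn, UNKNOWN_FLAVOR_IDS))  # [(13, 2), (20, 1)]
```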
-greg
Robert Bartlett Baron , said 2 months ago
I'm not re-ingesting things from the log files, as I would have to correct the log files.
I've gone through the modw_cloud database and moved the references to flavors named "unknown flavor ..." to one with 1 cpu, 1 ram and 1 disk.
I then reran xdmod-ingestor, and the script ran to completion, including the aggregation.
Do I need to delete the modw_cloud.session_records?
Rob.
Robert Bartlett Baron , said about 2 months ago
I think I have figured out this issue; we can close this one for now.
Robert Bartlett Baron
Any suggestions to fix my cloud aggregation? The debug log is as follows:
sh-4.2$ xdmod-ingestor --aggregate=cloud --last-modified-start-date "2018-01-01 12:30:00" --debug
2023-04-29 13:45:26 [info] Command: '/usr/bin/xdmod-ingestor' '--aggregate=cloud' '--last-modified-start-date' '2018-01-01 12:30:00' '--debug'
2023-04-29 13:45:27 [notice] xdmod-ingestor start (process_start_time: 2023-04-29 13:45:27)
2023-04-29 13:45:27 [debug] Creating data warehouse initilializer
2023-04-29 13:45:27 [info] Aggregating data
2023-04-29 13:45:27 [notice] Aggregating Cloud data
2023-04-29 13:45:27 [debug] Running ETL pipeline "cloud-state-pipeline" with parameters {"last-modified-start-date":"2018-01-01 12:30:00"}
2023-04-29 13:45:27 [debug] Loading configuration file /etc/xdmod/etl/etl.json
2023-04-29 13:45:27 [debug] Parsing /etc/xdmod/etl/etl.json
2023-04-29 13:45:27 [debug] Parsed 1 records
2023-04-29 13:45:27 [debug] Loading local configuration file /etc/xdmod/etl/etl.d/acls-import.json
2023-04-29 13:45:27 [debug] Parsing /etc/xdmod/etl/etl.d/acls-import.json
2023-04-29 13:45:27 [debug] Parsed 1 records
2023-04-29 13:45:27 [debug] Loading local configuration file /etc/xdmod/etl/etl.d/acls-xdmod-management.json
2023-04-29 13:45:27 [debug] Parsing /etc/xdmod/etl/etl.d/acls-xdmod-management.json
2023-04-29 13:45:27 [debug] Parsed 1 records
2023-04-29 13:45:27 [debug] Loading local configuration file /etc/xdmod/etl/etl.d/action_state_setup.json
2023-04-29 13:45:27 [debug] Parsing /etc/xdmod/etl/etl.d/action_state_setup.json
2023-04-29 13:45:27 [debug] Parsed 1 records
2023-04-29 13:45:27 [debug] Loading local configuration file /etc/xdmod/etl/etl.d/cloud_ingest_resource_specs.json
2023-04-29 13:45:27 [debug] Parsing /etc/xdmod/etl/etl.d/cloud_ingest_resource_specs.json
2023-04-29 13:45:27 [debug] Parsed 1 records
2023-04-29 13:45:27 [info] ETL\DataEndpoint\DirectoryScanner (name=Open Stack resource specifications, path=${CLOUD_RESOURCE_SPECS_DIRECTORY}): Relative path provided, absolute path recommended
2023-04-29 13:45:28 [debug] Qualifying relative path ${CLOUD_RESOURCE_SPECS_DIRECTORY} with /etc/xdmod/etl/etl_data.d
2023-04-29 13:45:28 [debug] Loading local configuration file /etc/xdmod/etl/etl.d/cloud_state_machine.json
2023-04-29 13:45:28 [debug] Parsing /etc/xdmod/etl/etl.d/cloud_state_machine.json
2023-04-29 13:45:28 [debug] Parsed 1 records
2023-04-29 13:45:28 [debug] Loading local configuration file /etc/xdmod/etl/etl.d/gateways.json
2023-04-29 13:45:28 [debug] Parsing /etc/xdmod/etl/etl.d/gateways.json
2023-04-29 13:45:28 [debug] Parsed 1 records
2023-04-29 13:45:28 [debug] Loading local configuration file /etc/xdmod/etl/etl.d/hpcdb-xdw.json
2023-04-29 13:45:28 [debug] Parsing /etc/xdmod/etl/etl.d/hpcdb-xdw.json
2023-04-29 13:45:28 [debug] Parsed 1 records
2023-04-29 13:45:28 [debug] Loading local configuration file /etc/xdmod/etl/etl.d/hpcdb.json
2023-04-29 13:45:28 [debug] Parsing /etc/xdmod/etl/etl.d/hpcdb.json
2023-04-29 13:45:28 [debug] Parsed 1 records
2023-04-29 13:45:28 [debug] Loading local configuration file /etc/xdmod/etl/etl.d/ingest_resources.json
2023-04-29 13:45:28 [debug] Parsing /etc/xdmod/etl/etl.d/ingest_resources.json
2023-04-29 13:45:28 [debug] Parsed 1 records
2023-04-29 13:45:28 [debug] Loading local configuration file /etc/xdmod/etl/etl.d/jobs.json
2023-04-29 13:45:28 [debug] Parsing /etc/xdmod/etl/etl.d/jobs.json
2023-04-29 13:45:28 [debug] Parsed 1 records
2023-04-29 13:45:28 [debug] (Configuration\JsonReferenceTransformer) Resolved reference 'etl_pipelines.d/jobs-xdw.json' to '/etc/xdmod/etl/etl_pipelines.d/jobs-xdw.json'
2023-04-29 13:45:28 [debug] (Configuration\JsonReferenceTransformer) Resolved reference 'etl_pipelines.d/jobs-xdw.json' to '/etc/xdmod/etl/etl_pipelines.d/jobs-xdw.json'
2023-04-29 13:45:28 [debug] Loading local configuration file /etc/xdmod/etl/etl.d/jobs_cloud_common.json
2023-04-29 13:45:28 [debug] Parsing /etc/xdmod/etl/etl.d/jobs_cloud_common.json
2023-04-29 13:45:28 [debug] Parsed 1 records
2023-04-29 13:45:28 [debug] Loading local configuration file /etc/xdmod/etl/etl.d/jobs_cloud_generic.json
2023-04-29 13:45:28 [debug] Parsing /etc/xdmod/etl/etl.d/jobs_cloud_generic.json
2023-04-29 13:45:28 [debug] Parsed 1 records
2023-04-29 13:45:28 [info] ETL\DataEndpoint\DirectoryScanner (name=Generic cloud event logs, path=${CLOUD_EVENT_LOG_DIRECTORY}): Relative path provided, absolute path recommended
2023-04-29 13:45:28 [debug] Qualifying relative path ${CLOUD_EVENT_LOG_DIRECTORY} with /etc/xdmod/etl/etl_data.d
2023-04-29 13:45:28 [info] ETL\DataEndpoint\DirectoryScanner (name=Generic volume logs, path=${CLOUD_EVENT_LOG_DIRECTORY}): Relative path provided, absolute path recommended
2023-04-29 13:45:28 [debug] Qualifying relative path ${CLOUD_EVENT_LOG_DIRECTORY} with /etc/xdmod/etl/etl_data.d
2023-04-29 13:45:28 [debug] Loading local configuration file /etc/xdmod/etl/etl.d/jobs_cloud_openstack.json
2023-04-29 13:45:28 [debug] Parsing /etc/xdmod/etl/etl.d/jobs_cloud_openstack.json
2023-04-29 13:45:28 [debug] Parsed 1 records
2023-04-29 13:45:28 [info] ETL\DataEndpoint\DirectoryScanner (name=Open Stack event logs, path=${CLOUD_EVENT_LOG_DIRECTORY}): Relative path provided, absolute path recommended
2023-04-29 13:45:28 [debug] Qualifying relative path ${CLOUD_EVENT_LOG_DIRECTORY} with /etc/xdmod/etl/etl_data.d
2023-04-29 13:45:28 [debug] Loading local configuration file /etc/xdmod/etl/etl.d/jobs_common.json
2023-04-29 13:45:28 [debug] Parsing /etc/xdmod/etl/etl.d/jobs_common.json
2023-04-29 13:45:28 [debug] Parsed 1 records
2023-04-29 13:45:28 [debug] Loading local configuration file /etc/xdmod/etl/etl.d/organizations.json
2023-04-29 13:45:28 [debug] Parsing /etc/xdmod/etl/etl.d/organizations.json
2023-04-29 13:45:28 [debug] Parsed 1 records
2023-04-29 13:45:28 [debug] Loading local configuration file /etc/xdmod/etl/etl.d/resource_types.json
2023-04-29 13:45:28 [debug] Parsing /etc/xdmod/etl/etl.d/resource_types.json
2023-04-29 13:45:28 [debug] Parsed 1 records
2023-04-29 13:45:28 [debug] Loading local configuration file /etc/xdmod/etl/etl.d/shredder.json
2023-04-29 13:45:28 [debug] Parsing /etc/xdmod/etl/etl.d/shredder.json
2023-04-29 13:45:28 [debug] Parsed 1 records
2023-04-29 13:45:28 [debug] Loading local configuration file /etc/xdmod/etl/etl.d/staging.json
2023-04-29 13:45:28 [debug] Parsing /etc/xdmod/etl/etl.d/staging.json
2023-04-29 13:45:28 [debug] Parsed 1 records
2023-04-29 13:45:28 [info] ETL\DataEndpoint\DirectoryScanner (name=usage-directory, path=${STORAGE_LOG_DIRECTORY}): Relative path provided, absolute path recommended
2023-04-29 13:45:28 [debug] Qualifying relative path ${STORAGE_LOG_DIRECTORY} with /etc/xdmod/etl/etl_data.d
2023-04-29 13:45:28 [debug] Loading local configuration file /etc/xdmod/etl/etl.d/storage.json
2023-04-29 13:45:28 [debug] Parsing /etc/xdmod/etl/etl.d/storage.json
2023-04-29 13:45:28 [debug] Parsed 1 records
2023-04-29 13:45:28 [debug] Loading local configuration file /etc/xdmod/etl/etl.d/test_suite.json
2023-04-29 13:45:28 [debug] Parsing /etc/xdmod/etl/etl.d/test_suite.json
2023-04-29 13:45:28 [debug] Parsed 1 records
2023-04-29 13:45:28 [debug] Loading local configuration file /etc/xdmod/etl/etl.d/verify.json
2023-04-29 13:45:28 [debug] Parsing /etc/xdmod/etl/etl.d/verify.json
2023-04-29 13:45:28 [debug] Parsed 1 records
2023-04-29 13:45:28 [debug] Loading local configuration file /etc/xdmod/etl/etl.d/xdb.json
2023-04-29 13:45:28 [debug] Parsing /etc/xdmod/etl/etl.d/xdb.json
2023-04-29 13:45:28 [debug] Parsed 1 records
2023-04-29 13:45:28 [debug] Loading local configuration file /etc/xdmod/etl/etl.d/xdmod-migration-9_5_0-10_0_0.json
2023-04-29 13:45:28 [debug] Parsing /etc/xdmod/etl/etl.d/xdmod-migration-9_5_0-10_0_0.json
2023-04-29 13:45:28 [debug] Parsed 1 records
2023-04-29 13:45:28 [debug] Stored object ETL\Configuration\EtlConfiguration (/etc/xdmod/etl/etl.json) in APCu cache with key ETL\Configuration\EtlConfiguration|/etc/xdmod/etl/etl.json|cb328615c5b43bfbe1404ea2dbb0a7fc in 1.465810s
2023-04-29 13:45:28 [debug] Running ETL pipeline with script options {"default-module-name":"xdmod","process-sections":["cloud-state-pipeline"],"last-modified-start-date":"2018-01-01 12:30:00"}
2023-04-29 13:45:29 [info] Verifying endpoint: ('Utility DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw, host=mariadb:3306, user=xdmod)
2023-04-29 13:45:29 [info] Verifying endpoint: ('Cloud DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw_cloud, host=mariadb:3306, user=xdmod)
2023-04-29 13:45:30 [info] Verifying endpoint: ('Cloud DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw_cloud, host=mariadb:3306, user=xdmod)
2023-04-29 13:45:30 [info] Create action xdmod.cloud-state-pipeline.cloud-state-action (ETL\Ingestor\CloudStateReconstructorTransformIngestor)
2023-04-29 13:45:30 [debug] Loading configuration file /etc/xdmod/etl/etl_action_defs.d/cloud_common/cloud_state.json
2023-04-29 13:45:30 [debug] Parsing /etc/xdmod/etl/etl_action_defs.d/cloud_common/cloud_state.json
2023-04-29 13:45:30 [debug] Parsed 1 records
2023-04-29 13:45:30 [debug] (Configuration\JsonReferenceTransformer) Resolved reference '/etc/xdmod/etl/etl_tables.d/cloud_common/event_reconstructed.json' to '/etc/xdmod/etl/etl_tables.d/cloud_common/event_reconstructed.json'
2023-04-29 13:45:30 [debug] Stored object Configuration\Configuration (/etc/xdmod/etl/etl_action_defs.d/cloud_common/cloud_state.json) in APCu cache with key Configuration\Configuration|/etc/xdmod/etl/etl_action_defs.d/cloud_common/cloud_state.json|080e8f8664a3addcdf68a2660e1a6e34 in 0.001082s
2023-04-29 13:45:30 [info] Verifying action: xdmod.cloud-state-pipeline.cloud-state-action (ETL\Ingestor\CloudStateReconstructorTransformIngestor)
2023-04-29 13:45:30 [info] Utility endpoint: ('Utility DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw, host=mariadb:3306, user=xdmod)
2023-04-29 13:45:30 [info] Source endpoint: ('Cloud DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw_cloud, host=mariadb:3306, user=xdmod)
2023-04-29 13:45:30 [info] Destination endpoint: ('Cloud DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw_cloud, host=mariadb:3306, user=xdmod)
2023-04-29 13:45:32 [debug] Created ETL destination table object for table definition key 'event_reconstructed'
2023-04-29 13:45:32 [debug] Create ETL source query object
2023-04-29 13:45:32 [info] Create action xdmod.cloud-state-pipeline.delete-session-records (ETL\Maintenance\ExecuteSql)
2023-04-29 13:45:32 [debug] Loading configuration file /etc/xdmod/etl/etl_action_defs.d/cloud_common/cloud_state.json
2023-04-29 13:45:32 [debug] Parsing /etc/xdmod/etl/etl_action_defs.d/cloud_common/cloud_state.json
2023-04-29 13:45:32 [debug] Parsed 1 records
2023-04-29 13:45:32 [debug] (Configuration\JsonReferenceTransformer) Resolved reference '/etc/xdmod/etl/etl_tables.d/cloud_common/event_reconstructed.json' to '/etc/xdmod/etl/etl_tables.d/cloud_common/event_reconstructed.json'
2023-04-29 13:45:32 [debug] Stored object Configuration\Configuration (/etc/xdmod/etl/etl_action_defs.d/cloud_common/cloud_state.json) in APCu cache with key Configuration\Configuration|/etc/xdmod/etl/etl_action_defs.d/cloud_common/cloud_state.json|080e8f8664a3addcdf68a2660e1a6e34 in 0.001497s
2023-04-29 13:45:32 [info] Verifying action: xdmod.cloud-state-pipeline.delete-session-records (ETL\Maintenance\ExecuteSql)
2023-04-29 13:45:32 [info] Utility endpoint: ('Utility DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw, host=mariadb:3306, user=xdmod)
2023-04-29 13:45:32 [info] Source endpoint: ('Cloud DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw_cloud, host=mariadb:3306, user=xdmod)
2023-04-29 13:45:32 [info] Destination endpoint: ('Cloud DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw_cloud, host=mariadb:3306, user=xdmod)
2023-04-29 13:45:32 [info] Create action xdmod.cloud-state-pipeline.cloud-session-records (ETL\Ingestor\DatabaseIngestor)
2023-04-29 13:45:32 [debug] Loading configuration file /etc/xdmod/etl/etl_action_defs.d/cloud_common/session_records.json
2023-04-29 13:45:32 [debug] Parsing /etc/xdmod/etl/etl_action_defs.d/cloud_common/session_records.json
2023-04-29 13:45:32 [debug] Parsed 1 records
2023-04-29 13:45:32 [debug] (Configuration\JsonReferenceTransformer) Resolved reference '/etc/xdmod/etl/etl_tables.d/cloud_common/session_records.json' to '/etc/xdmod/etl/etl_tables.d/cloud_common/session_records.json'
2023-04-29 13:45:32 [debug] Stored object Configuration\Configuration (/etc/xdmod/etl/etl_action_defs.d/cloud_common/session_records.json) in APCu cache with key Configuration\Configuration|/etc/xdmod/etl/etl_action_defs.d/cloud_common/session_records.json|080e8f8664a3addcdf68a2660e1a6e34 in 0.109032s
2023-04-29 13:45:32 [info] Verifying action: xdmod.cloud-state-pipeline.cloud-session-records (ETL\Ingestor\DatabaseIngestor)
2023-04-29 13:45:33 [info] Utility endpoint: ('Utility DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw, host=mariadb:3306, user=xdmod)
2023-04-29 13:45:33 [info] Source endpoint: ('Cloud DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw_cloud, host=mariadb:3306, user=xdmod)
2023-04-29 13:45:33 [info] Destination endpoint: ('Cloud DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw_cloud, host=mariadb:3306, user=xdmod)
2023-04-29 13:45:33 [debug] Created ETL destination table object for table definition key 'session_records'
2023-04-29 13:45:33 [debug] Create ETL source query object
2023-04-29 13:45:33 [debug] No destination_field_map specified
2023-04-29 13:45:33 [debug] Auto-generating destination_field_map using 27 source fields: instance_id, start_time, start_event_type_id, end_time, end_event_type_id, resource_id, instance_type, instance_type_id, num_cores, memory_mb, processorbucket_id, memorybucket_id, disk_gb, start_time_ts, end_time_ts, start_day_id, end_day_id, wallduration, person_id, systemaccount_id, submission_venue_id, domain_id, service_provider, account_id, principalinvestigator_person_id, fos_id, host_id
2023-04-29 13:45:33 [debug] Available fields for table key 'session_records': session_id, instance_id, resource_id, start_time, start_event_type_id, end_time, end_event_type_id, instance_type, instance_type_id, num_cores, memory_mb, processorbucket_id, memorybucket_id, disk_gb, start_time_ts, end_time_ts, start_day_id, end_day_id, wallduration, person_id, systemaccount_id, submission_venue_id, domain_id, last_modified, service_provider, account_id, principalinvestigator_person_id, fos_id, host_id
2023-04-29 13:45:33 [debug] Generated destination_field_map:
Table: session_records
instance_id -> instance_id
resource_id -> resource_id
start_time -> start_time
start_event_type_id -> start_event_type_id
end_time -> end_time
end_event_type_id -> end_event_type_id
instance_type -> instance_type
instance_type_id -> instance_type_id
num_cores -> num_cores
memory_mb -> memory_mb
processorbucket_id -> processorbucket_id
memorybucket_id -> memorybucket_id
disk_gb -> disk_gb
start_time_ts -> start_time_ts
end_time_ts -> end_time_ts
start_day_id -> start_day_id
end_day_id -> end_day_id
wallduration -> wallduration
person_id -> person_id
systemaccount_id -> systemaccount_id
submission_venue_id -> submission_venue_id
domain_id -> domain_id
service_provider -> service_provider
account_id -> account_id
principalinvestigator_person_id -> principalinvestigator_person_id
fos_id -> fos_id
host_id -> host_id
2023-04-29 13:45:33 [info] Create action xdmod.cloud-state-pipeline.CloudEventAggregatorByDay (ETL\Aggregator\JobListAggregator)
2023-04-29 13:45:33 [debug] Loading configuration file /etc/xdmod/etl/etl_action_defs.d/cloud_common/cloud_metrics_aggregation_by_day.json
2023-04-29 13:45:33 [debug] Parsing /etc/xdmod/etl/etl_action_defs.d/cloud_common/cloud_metrics_aggregation_by_day.json
2023-04-29 13:45:33 [debug] Parsed 1 records
2023-04-29 13:45:33 [debug] (Configuration\JsonReferenceTransformer) Resolved reference '/etc/xdmod/etl/etl_tables.d/cloud_common/cloudfact_by_day.json' to '/etc/xdmod/etl/etl_tables.d/cloud_common/cloudfact_by_day.json'
2023-04-29 13:45:33 [debug] Stored object Configuration\Configuration (/etc/xdmod/etl/etl_action_defs.d/cloud_common/cloud_metrics_aggregation_by_day.json) in APCu cache with key Configuration\Configuration|/etc/xdmod/etl/etl_action_defs.d/cloud_common/cloud_metrics_aggregation_by_day.json|080e8f8664a3addcdf68a2660e1a6e34 in 0.004374s
2023-04-29 13:45:33 [info] Verifying action: xdmod.cloud-state-pipeline.CloudEventAggregatorByDay (ETL\Aggregator\JobListAggregator)
2023-04-29 13:45:34 [info] Utility endpoint: ('Utility DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw, host=mariadb:3306, user=xdmod)
2023-04-29 13:45:36 [info] Source endpoint: ('Cloud DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw_cloud, host=mariadb:3306, user=xdmod)
2023-04-29 13:45:37 [info] Destination endpoint: ('Cloud DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw_cloud, host=mariadb:3306, user=xdmod)
2023-04-29 13:45:38 [debug] Create ETL destination aggregation table object
2023-04-29 13:45:38 [debug] Create ETL source query object
2023-04-29 13:45:38 [info] Create action xdmod.cloud-state-pipeline.CloudEventAggregator (ETL\Aggregator\SimpleAggregator)
2023-04-29 13:45:39 [debug] Loading configuration file /etc/xdmod/etl/etl_action_defs.d/cloud_common/cloud_metrics_aggregation.json
2023-04-29 13:45:39 [debug] Parsing /etc/xdmod/etl/etl_action_defs.d/cloud_common/cloud_metrics_aggregation.json
2023-04-29 13:45:39 [debug] Parsed 1 records
2023-04-29 13:45:39 [debug] (Configuration\JsonReferenceTransformer) Resolved reference '/etc/xdmod/etl/etl_tables.d/cloud_common/cloudfactby.json' to '/etc/xdmod/etl/etl_tables.d/cloud_common/cloudfactby.json'
2023-04-29 13:45:39 [debug] Stored object Configuration\Configuration (/etc/xdmod/etl/etl_action_defs.d/cloud_common/cloud_metrics_aggregation.json) in APCu cache with key Configuration\Configuration|/etc/xdmod/etl/etl_action_defs.d/cloud_common/cloud_metrics_aggregation.json|080e8f8664a3addcdf68a2660e1a6e34 in 0.004379s
2023-04-29 13:45:39 [info] Verifying action: xdmod.cloud-state-pipeline.CloudEventAggregator (ETL\Aggregator\SimpleAggregator)
2023-04-29 13:45:40 [info] Utility endpoint: ('Utility DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw, host=mariadb:3306, user=xdmod)
2023-04-29 13:45:41 [info] Source endpoint: ('Cloud DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw_cloud, host=mariadb:3306, user=xdmod)
2023-04-29 13:45:41 [info] Destination endpoint: ('Cloud DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw_cloud, host=mariadb:3306, user=xdmod)
2023-04-29 13:45:42 [debug] Create ETL destination aggregation table object
2023-04-29 13:45:42 [debug] Create ETL source query object
2023-04-29 13:45:42 [info] Create action xdmod.cloud-state-pipeline.CloudAggregatorSessionlist (ETL\Ingestor\ExplodeTransformIngestor)
2023-04-29 13:45:43 [debug] Loading configuration file /etc/xdmod/etl/etl_action_defs.d/cloud_common/cloudfact_by_day_sessionlist.json
2023-04-29 13:45:43 [debug] Parsing /etc/xdmod/etl/etl_action_defs.d/cloud_common/cloudfact_by_day_sessionlist.json
2023-04-29 13:45:43 [debug] Parsed 1 records
2023-04-29 13:45:43 [debug] (Configuration\JsonReferenceTransformer) Resolved reference '/etc/xdmod/etl/etl_tables.d/cloud_common/cloudfact_by_day_sessionlist.json' to '/etc/xdmod/etl/etl_tables.d/cloud_common/cloudfact_by_day_sessionlist.json'
2023-04-29 13:45:43 [debug] Stored object Configuration\Configuration (/etc/xdmod/etl/etl_action_defs.d/cloud_common/cloudfact_by_day_sessionlist.json) in APCu cache with key Configuration\Configuration|/etc/xdmod/etl/etl_action_defs.d/cloud_common/cloudfact_by_day_sessionlist.json|080e8f8664a3addcdf68a2660e1a6e34 in 0.002177s
2023-04-29 13:45:43 [info] Verifying action: xdmod.cloud-state-pipeline.CloudAggregatorSessionlist (ETL\Ingestor\ExplodeTransformIngestor)
2023-04-29 13:45:43 [info] Utility endpoint: ('Utility DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw, host=mariadb:3306, user=xdmod)
2023-04-29 13:45:44 [info] Source endpoint: ('Cloud DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw_cloud, host=mariadb:3306, user=xdmod)
2023-04-29 13:45:44 [info] Destination endpoint: ('Cloud DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw_cloud, host=mariadb:3306, user=xdmod)
2023-04-29 13:45:44 [debug] Created ETL destination table object for table definition key 'cloudfact_by_day_sessionlist'
2023-04-29 13:45:44 [debug] Create ETL source query object
2023-04-29 13:45:44 [info] Empty lock directory specified, using temp directory: /tmp
2023-04-29 13:45:44 [info] Obtaining lock file '/tmp/etlv2_416'
2023-04-29 13:45:44 [notice] Start processing section 'xdmod.cloud-state-pipeline'
2023-04-29 13:45:44 [info] start (action_name: xdmod.cloud-state-pipeline.cloud-state-action, action: xdmod.cloud-state-pipeline.cloud-state-action (ETL\Ingestor\CloudStateReconstructorTransformIngestor), start_date: , end_date: )
2023-04-29 13:45:45 [info] Truncate destination table:
modw_cloud
.event_reconstructed
2023-04-29 13:45:46 [debug] Truncate destination task ('Cloud DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw_cloud, host=mariadb:3306, user=xdmod):
TRUNCATE TABLE
modw_cloud
.event_reconstructed
2023-04-29 13:45:48 [debug] Discover table 'modw_cloud.event_reconstructed'
2023-04-29 13:45:48 [info] Process date interval (1/1) (start: none, end: none)
2023-04-29 13:45:49 [debug] Available Variables: DESTINATION_SCHEMA='modw_cloud', DW_ETL_LOG_RECIPIENT='nobody@massopen.cloud', LAST_MODIFIED='2018-01-01 12:30:00', LAST_MODIFIED_START_DATE='2018-01-01 12:30:00', SOURCE_SCHEMA='modw_cloud', TIMEZONE='UTC', UTILITY_SCHEMA='modw', action_definition_dir='/etc/xdmod/etl/etl_action_defs.d', base_dir='/etc/xdmod/etl', data_dir='/etc/xdmod/etl/etl_data.d', local_config_dir='/etc/xdmod/etl/etl.d', macro_dir='/etc/xdmod/etl/etl_macros.d', schema_dir='/etc/xdmod/etl/etl_schemas.d', sql_dir='/etc/xdmod/etl/etl_sql.d', table_definition_dir='/etc/xdmod/etl/etl_tables.d'
2023-04-29 13:45:49 [debug] ETL\Ingestor\pdoIngestor::transform() overriden by ETL\Ingestor\CloudStateReconstructorTransformIngestor::transform()
2023-04-29 13:45:49 [debug] Using multi-database ingest
2023-04-29 13:45:49 [debug] Using temporary file '/tmp/modw_cloud.event_reconstructed.data.ts_1682775949.114201639119VqyP' for destination table key 'event_reconstructed'
2023-04-29 13:45:49 [debug] LOAD statement for destination table key 'event_reconstructed' ('Cloud DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw_cloud, host=mariadb:3306, user=xdmod):
CREATE TABLE
modw_cloud
.tmp_event_reconstructed_1682775949
LIKEmodw_cloud
.event_reconstructed
; ALTER TABLEmodw_cloud
.tmp_event_reconstructed_1682775949
DISABLE KEYS; LOAD DATA LOCAL INFILE '/tmp/modw_cloud.event_reconstructed.data.ts_1682775949.114201639119VqyP' INTO TABLEmodw_cloud
.tmp_event_reconstructed_1682775949
FIELDS TERMINATED BY 0x1e OPTIONALLY ENCLOSED BY 0x1f ESCAPED BY 0x5c LINES TERMINATED BY 0x1d (resource_id
,instance_id
,start_time_ts
,start_event_id
,end_time_ts
,end_event_id
); SHOW WARNINGS; INSERT INTOmodw_cloud
.event_reconstructed
(resource_id
,instance_id
,start_time_ts
,start_event_id
,end_time_ts
,end_event_id
) SELECTresource_id
,instance_id
,start_time_ts
,start_event_id
,end_time_ts
,end_event_id
FROMmodw_cloud
.tmp_event_reconstructed_1682775949
ON DUPLICATE KEY UPDATEresource_id
=VALUES(resource_id
),instance_id
=VALUES(instance_id
),start_time_ts
=VALUES(start_time_ts
),start_event_id
=VALUES(start_event_id
),end_time_ts
=VALUES(end_time_ts
),end_event_id
=VALUES(end_event_id
); DROP TABLEmodw_cloud
.tmp_event_reconstructed_1682775949
;
2023-04-29 13:45:49 [info] Multi-database ingest into ('Cloud DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw_cloud, host=mariadb:3306, user=xdmod)
2023-04-29 13:45:49 [info] Switching to un-buffered query mode
2023-04-29 13:45:51 [debug] Current net_write_timeout = 60
2023-04-29 13:45:51 [info] ETL\Ingestor\CloudStateReconstructorTransformIngestor: Querying ('Cloud DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw_cloud, host=mariadb:3306, user=xdmod)
2023-04-29 13:45:52 [debug] Source query ('Cloud DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw_cloud, host=mariadb:3306, user=xdmod):
SELECT
e.resource_id AS
resource_id
,e.instance_id AS
instance_id
,event_time_ts AS
event_time_ts
,event_type_id AS
event_type_id
,-1 AS
start_time_ts
,-1 AS
start_event_id
,-1 AS
end_time_ts
,-1 AS
end_event_id
FROM
modw_cloud
.event
AS e WHERE instance_id IN (SELECT DISTINCT instance_id from modw_cloud.event WHERE last_modified > "2018-01-01 12:30:00") AND event_type_id IN (1,2,3,4,6,5,7,8,17,19,20,44,45,54,55,56,57,58,59,60,61,62,63,64,16)UNION ALL
SELECT 0,0,0,0,0,0,0,0
ORDER BY 1 DESC, 2 DESC, 3 ASC, 4 DESC
2023-04-29 13:46:08 [debug] Loaded 2,393 records into 'event_reconstructed'
2023-04-29 13:46:08 [debug] Loaded 1 files in 0s
2023-04-29 13:46:08 [info] ETL\Ingestor\CloudStateReconstructorTransformIngestor: Processed 2,393 records (634,912 source records) in 19s
2023-04-29 13:46:08 [info] Returning buffered query mode to: true
2023-04-29 13:46:08 [info] Execute Post-execute tasks: ('Cloud DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw_cloud, host=mariadb:3306, user=xdmod)
2023-04-29 13:46:08 [debug] ANALYZE TABLE
modw_cloud
.event_reconstructed
2023-04-29 13:46:08 [debug] Completed in 0.041930s
2023-04-29 13:46:08 [info] ETL\Ingestor\CloudStateReconstructorTransformIngestor: Rows Processed: 2393 records (Time Taken: 23.46 s)
2023-04-29 13:46:08 [notice] (action: xdmod.cloud-state-pipeline.cloud-state-action (ETL\Ingestor\CloudStateReconstructorTransformIngestor), start_time: 1682775945.3225, end_time: 1682775968.782, elapsed_time: 23.45956, records_examined: 2393, records_loaded: 2393)
2023-04-29 13:46:09 [info] end (action_name: xdmod.cloud-state-pipeline.cloud-state-action, action: xdmod.cloud-state-pipeline.cloud-state-action (ETL\Ingestor\CloudStateReconstructorTransformIngestor))
2023-04-29 13:46:09 [info] start (action_name: xdmod.cloud-state-pipeline.delete-session-records, action: xdmod.cloud-state-pipeline.delete-session-records (ETL\Maintenance\ExecuteSql), start_date: , end_date: )
2023-04-29 13:46:09 [notice] Processing SQL file '/etc/xdmod/etl/etl_sql.d/cloud_common/delete_session_records.sql' using delimiter '//' containing 1 statements
2023-04-29 13:46:09 [info] Executing statement ( 1 / 1) (action: xdmod.cloud-state-pipeline.delete-session-records (ETL\Maintenance\ExecuteSql)-sql-1, endpoint: ('Cloud DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw_cloud, host=mariadb:3306, user=xdmod), sql: DELETE sr FROM
modw_cloud.session_records AS sr
JOIN
(SELECT DISTINCT instance_id FROM modw_cloud.event_reconstructed) AS ev ON ev.instance_id = sr.instance_id;)
2023-04-29 13:46:10 [info] Finished executing statement ( 1 / 1) (action: xdmod.cloud-state-pipeline.delete-session-records (ETL\Maintenance\ExecuteSql)-sql-1, rows: 1624, start_time: 1682775969.6787, end_time: 1682775970.4044, elapsed_time: 0.7256760597229)
2023-04-29 13:46:10 [notice] Finished Processing 1 SQL statements
2023-04-29 13:46:10 [notice] (action: xdmod.cloud-state-pipeline.delete-session-records (ETL\Maintenance\ExecuteSql), start_time: 1682775969.2951, end_time: 1682775970.4722, elapsed_time: 1.17711)
2023-04-29 13:46:10 [info] end (action_name: xdmod.cloud-state-pipeline.delete-session-records, action: xdmod.cloud-state-pipeline.delete-session-records (ETL\Maintenance\ExecuteSql))
2023-04-29 13:46:10 [info] start (action_name: xdmod.cloud-state-pipeline.cloud-session-records, action: xdmod.cloud-state-pipeline.cloud-session-records (ETL\Ingestor\DatabaseIngestor), start_date: , end_date: )
2023-04-29 13:46:11 [debug] Discover table 'modw_cloud.session_records'
2023-04-29 13:46:11 [debug] Column last_modified: values for "timestamp" differ ("CURRENT_TIMESTAMP on update current_timestamp" != "current_timestamp() on update current_timestamp()")
2023-04-29 13:46:11 [notice] Altering table
modw_cloud
.session_records
2023-04-29 13:46:11 [debug] Alter table SQL ('Cloud DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw_cloud, host=mariadb:3306, user=xdmod):
ALTER TABLE
modw_cloud
.session_records
CHANGE COLUMN
last_modified
last_modified
timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP on update current_timestamp ;
2023-04-29 13:46:11 [info] Process date interval (1/1) (start: none, end: none)
2023-04-29 13:46:11 [debug] Available Variables: DESTINATION_SCHEMA='modw_cloud', DW_ETL_LOG_RECIPIENT='nobody@massopen.cloud', LAST_MODIFIED='2018-01-01 12:30:00', LAST_MODIFIED_START_DATE='2018-01-01 12:30:00', SOURCE_SCHEMA='modw_cloud', TIMEZONE='UTC', UTILITY_SCHEMA='modw', action_definition_dir='/etc/xdmod/etl/etl_action_defs.d', base_dir='/etc/xdmod/etl', data_dir='/etc/xdmod/etl/etl_data.d', local_config_dir='/etc/xdmod/etl/etl.d', macro_dir='/etc/xdmod/etl/etl_macros.d', schema_dir='/etc/xdmod/etl/etl_schemas.d', sql_dir='/etc/xdmod/etl/etl_sql.d', table_definition_dir='/etc/xdmod/etl/etl_tables.d'
2023-04-29 13:46:11 [debug] Allowing same-server SQL optimizations
2023-04-29 13:46:11 [info] Single-database ingest into ('Cloud DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw_cloud, host=mariadb:3306, user=xdmod)
2023-04-29 13:46:11 [debug] INSERT INTO
modw_cloud
.session_records
(
instance_id
,start_time
,start_event_type_id
,end_time
,end_event_type_id
,resource_id
,instance_type
,instance_type_id
,num_cores
,memory_mb
,processorbucket_id
,memorybucket_id
,disk_gb
,start_time_ts
,end_time_ts
,start_day_id
,end_day_id
,wallduration
,person_id
,systemaccount_id
,submission_venue_id
,domain_id
,service_provider
,account_id
,principalinvestigator_person_id
,fos_id
,host_id
)SELECT
e.instance_id AS
instance_id
,FROM_UNIXTIME(e.start_time_ts) AS
start_time
,e.start_event_id AS
start_event_type_id
,FROM_UNIXTIME(e.end_time_ts) AS
end_time
,e.end_event_id AS
end_event_type_id
,it.resource_id AS
resource_id
,itt.instance_type AS
instance_type
,itt.instance_type_id AS
instance_type_id
,itt.num_cores AS
num_cores
,itt.memory_mb AS
memory_mb
,(SELECT id FROM modw_cloud.processor_buckets pb WHERE itt.num_cores BETWEEN pb.min_processors AND pb.max_processors) AS
processorbucket_id
,(SELECT id FROM modw_cloud.memory_buckets mb WHERE itt.memory_mb BETWEEN mb.min_memory AND mb.max_memory) AS
memorybucket_id
,itt.disk_gb AS
disk_gb
,FLOOR(e.start_time_ts) AS
start_time_ts
,FLOOR(e.end_time_ts) AS
end_time_ts
,YEAR(FROM_UNIXTIME(e.start_time_ts)) * 100000 + DAYOFYEAR(FROM_UNIXTIME(e.start_time_ts)) AS
start_day_id
,YEAR(FROM_UNIXTIME(e.end_time_ts)) * 100000 + DAYOFYEAR(FROM_UNIXTIME(e.end_time_ts)) AS
end_day_id
,FLOOR(e.end_time_ts) - FLOOR(e.start_time_ts) AS
wallduration
,ev.person_id AS
person_id
,ev.systemaccount_id AS
systemaccount_id
,ev.submission_venue_id AS
submission_venue_id
,ev.domain_id AS
domain_id
,ev.service_provider AS
service_provider
,a.account_id AS
account_id
,a.principalinvestigator_person_id AS
principalinvestigator_person_id
,a.fos_id AS
fos_id
,ev.host_id AS
host_id
FROM
modw_cloud
.event_reconstructed
AS eJOIN
modw_cloud
.event
AS ev ON e.start_event_id = ev.event_type_id AND e.start_time_ts = ev.event_time_ts AND e.instance_id = ev.instance_id AND e.resource_id = ev.resource_idJOIN
modw_cloud
.instance
AS it ON e.instance_id = it.instance_id AND e.resource_id = it.resource_idJOIN
modw_cloud
.instance_data
AS itd ON itd.resource_id = it.resource_id AND itd.event_id = ev.event_idJOIN
modw_cloud
.instance_type
AS itt ON itt.resource_id = it.resource_id AND itt.instance_type_id = itd.instance_type_idLEFT JOIN
modw_cloud
.account
AS a ON it.account_id = a.account_idORDER BY resource_id asc, instance_id asc, start_time_ts asc
ON DUPLICATE KEY UPDATE
instance_id
=VALUES(instance_id
),start_time
=VALUES(start_time
),start_event_type_id
=VALUES(start_event_type_id
),end_time
=VALUES(end_time
),end_event_type_id
=VALUES(end_event_type_id
),resource_id
=VALUES(resource_id
),instance_type
=VALUES(instance_type
),instance_type_id
=VALUES(instance_type_id
),num_cores
=VALUES(num_cores
),memory_mb
=VALUES(memory_mb
),processorbucket_id
=VALUES(processorbucket_id
),memorybucket_id
=VALUES(memorybucket_id
),disk_gb
=VALUES(disk_gb
),start_time_ts
=VALUES(start_time_ts
),end_time_ts
=VALUES(end_time_ts
),start_day_id
=VALUES(start_day_id
),end_day_id
=VALUES(end_day_id
),wallduration
=VALUES(wallduration
),person_id
=VALUES(person_id
),systemaccount_id
=VALUES(systemaccount_id
),submission_venue_id
=VALUES(submission_venue_id
),domain_id
=VALUES(domain_id
),service_provider
=VALUES(service_provider
),account_id
=VALUES(account_id
),principalinvestigator_person_id
=VALUES(principalinvestigator_person_id
),fos_id
=VALUES(fos_id
),host_id
=VALUES(host_id
)
2023-04-29 13:46:17 [error] {"message":"xdmod.cloud-state-pipeline.cloud-session-records (ETL\Ingestor\DatabaseIngestor): SQLSTATE[23000]: Integrity constraint violation: 1048 Column 'processorbucket_id' cannot be null Exception: 'SQLSTATE[23000]: Integrity constraint violation: 1048 Column 'processorbucket_id' cannot be null'"}
2023-04-29 13:46:17 [warning] Stopping ETL due to exception in xdmod.cloud-state-pipeline.cloud-session-records (ETL\Ingestor\DatabaseIngestor)
2023-04-29 13:46:18 [info] Releasing lock file '/tmp/etlv2_416'
2023-04-29 13:46:19 [critical] Aggregation failed: xdmod.cloud-state-pipeline.cloud-session-records (ETL\Ingestor\DatabaseIngestor): SQLSTATE[23000]: Integrity constraint violation: 1048 Column 'processorbucket_id' cannot be null Exception: 'SQLSTATE[23000]: Integrity constraint violation: 1048 Column 'processorbucket_id' cannot be null'
(stacktrace:
0 /usr/share/xdmod/classes/ETL/Ingestor/pdoIngestor.php(544): CCR\Loggable->logAndThrowException('SQLSTATE[23000]...', Array)
1 /usr/share/xdmod/classes/ETL/Ingestor/pdoIngestor.php(459): ETL\Ingestor\pdoIngestor->singleDatabaseIngest()
2 /usr/share/xdmod/classes/ETL/Ingestor/aIngestor.php(126): ETL\Ingestor\pdoIngestor->_execute()
3 /usr/share/xdmod/classes/ETL/EtlOverseer.php(473): ETL\Ingestor\aIngestor->execute(Object(ETL\EtlOverseerOptions))
4 /usr/share/xdmod/classes/ETL/EtlOverseer.php(435): ETL\EtlOverseer->_execute('xdmod.cloud-sta...', Object(ETL\Ingestor\DatabaseIngestor))
5 /usr/share/xdmod/classes/ETL/Utilities.php(281): ETL\EtlOverseer->execute(Object(ETL\Configuration\EtlConfiguration))
6 /usr/share/xdmod/classes/OpenXdmod/DataWarehouseInitializer.php(362): ETL\Utilities::runEtlPipeline(Array, Object(CCR\Logger), Array)
7 /usr/bin/xdmod-ingestor(310): OpenXdmod\DataWarehouseInitializer->aggregateCloudData('2018-01-01 12:3...')
8 /usr/bin/xdmod-ingestor(21): main()
9 {main})

Gregary Dean, said 4 months ago
Cc: joachimw@bu.edu, msd@bu.edu
Ticket: https://help.xdmod.org/support/tickets/32768
Hi Rob,
This error would happen if you have a VM whose number of processors is outside of the ranges that exist in the modw_cloud.processor_buckets table. Can you run "SELECT * FROM modw_cloud.processor_buckets" and see what it returns? This is the default for the table:
+----+----------------+----------------+-------------+
| id | min_processors | max_processors | description |
+----+----------------+----------------+-------------+
|  1 |              1 |              1 | 1           |
|  2 |              2 |              3 | 2 - 3       |
|  3 |              4 |              7 | 4 - 7       |
|  4 |              8 |             11 | 8 - 11      |
|  5 |             12 |             15 | 12 - 15     |
|  6 |             16 |             32 | 16 - 32     |
|  7 |             33 |     2147483647 | > 32        |
+----+----------------+----------------+-------------+
Basically, the number of CPUs your VMs are reporting should be between the minimum value in min_processors and the maximum value in max_processors.
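To illustrate why a flavor reporting 0 cores leads to the "cannot be null" error in the log, here is a minimal Python sketch. It loads the default bucket rows listed above into an in-memory SQLite database (a stand-in for MariaDB, used only so the example is self-contained) and runs the same kind of BETWEEN lookup that the failing INSERT uses for processorbucket_id. The function name bucket_for is an assumption made for this example.

```python
import sqlite3

# Default rows of modw_cloud.processor_buckets, as listed above.
DEFAULT_BUCKETS = [
    (1, 1, 1, "1"),
    (2, 2, 3, "2 - 3"),
    (3, 4, 7, "4 - 7"),
    (4, 8, 11, "8 - 11"),
    (5, 12, 15, "12 - 15"),
    (6, 16, 32, "16 - 32"),
    (7, 33, 2147483647, "> 32"),
]


def bucket_for(num_cores):
    """Return the bucket id for num_cores, or None when no range matches."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE processor_buckets ("
        "id INTEGER, min_processors INTEGER, "
        "max_processors INTEGER, description TEXT)"
    )
    conn.executemany(
        "INSERT INTO processor_buckets VALUES (?, ?, ?, ?)", DEFAULT_BUCKETS
    )
    # Same shape as the subquery in the failing INSERT statement.
    row = conn.execute(
        "SELECT id FROM processor_buckets "
        "WHERE ? BETWEEN min_processors AND max_processors",
        (num_cores,),
    ).fetchone()
    conn.close()
    return row[0] if row else None


for cores in (0, 1, 6, 64):
    print(cores, bucket_for(cores))
```

A core count of 0 matches no range, so the lookup comes back empty. In the real INSERT that empty result becomes NULL, which the processorbucket_id column rejects.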
-greg

Robert Bartlett Baron, said 4 months ago
greg,
That sql gives the default:
MariaDB [(none)]> SELECT * FROM modw_cloud.processor_buckets
    -> ;
+----+----------------+----------------+-------------+
| id | min_processors | max_processors | description |
+----+----------------+----------------+-------------+
|  1 |              1 |              1 | 1           |
|  2 |              2 |              3 | 2 - 3       |
|  3 |              4 |              7 | 4 - 7       |
|  4 |              8 |             11 | 8 - 11      |
|  5 |             12 |             15 | 12 - 15     |
|  6 |             16 |             32 | 16 - 32     |
|  7 |             33 |     2147483647 | > 32        |
+----+----------------+----------------+-------------+
7 rows in set (0.01 sec)
Is there an easy way to find the records whose number of processors is, say, zero to 1?

Gregary Dean, said 4 months ago
Cc: joachimw@bu.edu, msd@bu.edu
Ticket: https://help.xdmod.org/support/tickets/32768
Hi Rob,
Here are two queries that can help find that information. The first finds any OpenStack flavor whose number of CPUs is less than 1. This will not get you the individual VMs, just the OpenStack flavors.

SELECT * FROM modw_cloud.instance_type WHERE num_cores < 1 AND resource_id != -1

This second one will get you both the UUID and OpenStack flavor for a VM.

SELECT i.provider_identifier, itt.display
FROM event AS ev
JOIN instance AS i ON ev.instance_id = i.instance_id
JOIN instance_data AS itd ON itd.resource_id = ev.resource_id AND itd.event_id = ev.event_id
JOIN instance_type AS itt ON itt.instance_type_id = itd.instance_type_id
WHERE ev.event_type_id IN (1,7,57,56,58,60,63,2,8,16,20,61,59,4,3,5,44,54,62,64,6,17,19,45,55) AND itt.num_cores < 1
GROUP BY 1
ORDER BY 2 DESC

One of these should help with finding the issue.
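
A more general check, which is not one of the two queries above, is to list every instance type whose num_cores matches no row in processor_buckets at all. The sketch below shows the idea in Python with an in-memory SQLite database. The instance_type rows are made-up sample data, and the table layout is reduced to the columns the check needs. Against a real installation the same LEFT JOIN would presumably be run on modw_cloud.instance_type and modw_cloud.processor_buckets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE processor_buckets (
    id INTEGER, min_processors INTEGER, max_processors INTEGER);
INSERT INTO processor_buckets VALUES
    (1, 1, 1), (2, 2, 3), (3, 4, 7), (4, 8, 11),
    (5, 12, 15), (6, 16, 32), (7, 33, 2147483647);

CREATE TABLE instance_type (
    instance_type_id INTEGER, display TEXT, num_cores INTEGER);
-- Hypothetical sample rows, for illustration only.
INSERT INTO instance_type VALUES
    (1, 'example.small', 1),
    (2, 'example.large', 8),
    (3, 'example unknown flavor', 0);
""")

# Instance types for which no bucket range contains num_cores.
UNBUCKETED_SQL = """
SELECT itt.instance_type_id, itt.display, itt.num_cores
FROM instance_type AS itt
LEFT JOIN processor_buckets AS pb
    ON itt.num_cores BETWEEN pb.min_processors AND pb.max_processors
WHERE pb.id IS NULL
ORDER BY itt.instance_type_id
"""

unbucketed = conn.execute(UNBUCKETED_SQL).fetchall()
print(unbucketed)
```

With the sample rows, only the flavor with 0 cores is returned. The INSERT in the log looks up memorybucket_id in the same way, so a similar join against modw_cloud.memory_buckets may be worth running for flavors with 0 memory.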
-greg

Robert Bartlett Baron, said 3 months ago
Interestingly enough the result of the first query turned up 8 such flavors:
+-------------+------------------+------------------------------------------------------+------------------------------------------------------+-------------+-----------+-----------+---------+-------------------+----------+
| resource_id | instance_type_id | instance_type                                        | display                                              | description | num_cores | memory_mb | disk_gb | start_time        | end_time |
+-------------+------------------+------------------------------------------------------+------------------------------------------------------+-------------+-----------+-----------+---------+-------------------+----------+
|           2 |               13 | unknown flavor(0cdf836c-e0a6-48ba-b98d-e8ed7e313acf) | unknown flavor(0cdf836c-e0a6-48ba-b98d-e8ed7e313acf) | NULL        |         0 |         0 |       0 | 1644361873.000000 | NULL     |
|           2 |               14 | unknown flavor(1202a547-f2d7-4f8c-bb2b-e55bcce24f0f) | unknown flavor(1202a547-f2d7-4f8c-bb2b-e55bcce24f0f) | NULL        |         0 |         0 |       0 | 1649795786.000000 | NULL     |
|           2 |               15 | unknown flavor(255ae564-1164-43eb-b3da-13a115f4c722) | unknown flavor(255ae564-1164-43eb-b3da-13a115f4c722) | NULL        |         0 |         0 |       0 | 1644526590.000000 | NULL     |
|           2 |               16 | unknown flavor(3dfd915f-4730-434f-aec9-f1fb738423fd) | unknown flavor(3dfd915f-4730-434f-aec9-f1fb738423fd) | NULL        |         0 |         0 |       0 | 1646354299.000000 | NULL     |
|           2 |               17 | unknown flavor(8dcbf98b-412c-44b7-a56a-cb8dbec11d4b) | unknown flavor(8dcbf98b-412c-44b7-a56a-cb8dbec11d4b) | NULL        |         0 |         0 |       0 | 1647996018.000000 | NULL     |
|           2 |               18 | unknown flavor(d3652ccd-713e-40a0-83b6-78888bf10015) | unknown flavor(d3652ccd-713e-40a0-83b6-78888bf10015) | NULL        |         0 |         0 |       0 | 1647960881.000000 | NULL     |
|           2 |               19 | unknown flavor(e2ec72ca-d9a7-4be8-991b-6c94e109ff95) | unknown flavor(e2ec72ca-d9a7-4be8-991b-6c94e109ff95) | NULL        |         0 |         0 |       0 | 1644361910.000000 | NULL     |
|           2 |               20 | unknown flavor(f8a78508-9dc5-4374-8f12-4b3ea0037991) | unknown flavor(f8a78508-9dc5-4374-8f12-4b3ea0037991) | NULL        |         0 |         0 |       0 | 1646836326.000000 | NULL     |
+-------------+------------------+------------------------------------------------------+------------------------------------------------------+-------------+-----------+-----------+---------+-------------------+----------+
8 rows in set (0.03 sec)
Also, from the second query, I see that data is continuing to arrive from VMs that use one of the 8 unknown flavors.
I tried setting those 8 flavors to have 1 CPU, and that failed spectacularly: the next time the shredder and ingestor ran, I suddenly had duplicates of those 8 unknown flavors, with the new ones having 0 as the CPU count.
So just to bounce the idea off of you, I am planning to do the following:
0) back up the DB
1) change the code that pulls the data to place 1s in the CPU, memory, and disk fields
2) have other people resize their VMs to defined flavors
3) work through the shredding tables to add 1s to the flavors
4) re-ingest the cloud data
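
Step 1 of this plan, substituting 1 for missing or zero CPU, memory, and disk values before the data reaches the shredder, could be sketched roughly as follows in Python. The thread does not show the collection script or its record format, so the function name normalize_flavor and the dictionary keys (borrowed from the instance_type column names) are assumptions made for illustration.

```python
# Hypothetical field names, borrowed from the instance_type table columns.
SIZE_FIELDS = ("num_cores", "memory_mb", "disk_gb")


def normalize_flavor(flavor):
    """Return a copy of flavor with missing or non-positive sizes set to 1."""
    fixed = dict(flavor)
    for key in SIZE_FIELDS:
        value = fixed.get(key)
        if value is None or value < 1:
            fixed[key] = 1
    return fixed


# Made-up record resembling one of the unknown flavors above.
raw = {
    "instance_type": "unknown flavor(example)",
    "num_cores": 0,
    "memory_mb": 0,
    "disk_gb": 0,
}
print(normalize_flavor(raw))
```

Fixing the values at the source, rather than only in the database, should avoid the duplicate flavors described above, because the next shred would no longer report 0 for those fields.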
Am I forgetting anything?

Gregary Dean, said 3 months ago
Cc: joachimw@bu.edu, msd@bu.edu
Ticket: https://help.xdmod.org/support/tickets/32768
Hi Rob,
I'm sorry for the delay in answering. I was on vacation for the past week. Those steps sound fine. If you are planning on re-ingesting the data from your log files, I would suggest removing the offending instance types from the instance type table, and any records from other tables that reference these instance types, before re-ingesting. Here are two queries that should help with doing that.

DELETE ev, i, itd, itt
FROM event AS ev
JOIN instance AS i ON ev.instance_id = i.instance_id
JOIN instance_data AS itd ON itd.resource_id = ev.resource_id and itd.event_id = ev.event_id
JOIN instance_type AS itt ON itt.instance_type_id = itd.instance_type_id
WHERE ev.event_type_id IN (1,7,57,56,58,60,63,2,8,16,20,61,59,4,3,5,44,54,62,64,6,17,19,45,55) AND itt.num_cores < 1;

DELETE sr FROM modw_cloud.session_records AS sr WHERE sr.instance_type_id IN (13,14,15,16,17,18,19,20)

-greg

Robert Bartlett Baron, said 2 months ago
I'm not re-ingesting things from the log files, as I would have to correct the log files.
I've gone through the modw_cloud database and moved the references to flavors named with "undefined flavor ..." to one with 1 cpu, 1 ram and 1 disk.
I then reran xdmod-ingestor, and the script ran to completion, including the aggregation.
Do I need to delete the modw_cloud.session_records?
Rob.

Robert Bartlett Baron, said about 2 months ago
I think I have figured out this issue; we can close this one for now.
Robert Bartlett Baron