EMCECS / ecs-sync

ecs-sync is a bulk copy utility that can move data between various systems in parallel
Apache License 2.0

Problem with downloading Complete Object Report #28

Closed FAD775544 closed 6 years ago

FAD775544 commented 6 years ago

Hi,

The Download Complete Object Report function seems to have been broken in the last few releases. I get a 502 Proxy Error when trying to download any complete object report (see screenshots).

[Screenshots: capture, capture2]

twincitiesguy commented 6 years ago

You may want to check the UI log file (/var/log/ecs-sync/ecs-sync-ui.log). It looks like there was some kind of error generating the report.

Keep in mind the all-object and error reports are only available while a job is still active (before it is archived) and only if the job has a database table. If you did not specify a database table, but started the job through the UI, it will have a temporary table which should suffice for these reports. However, be aware that temporary tables are deleted when a job is archived.

FAD775544 commented 6 years ago

Hi,

I had a look in /var/log/ecs-sync/ecs-sync-ui.log, and there is no error at all at the time I downloaded the report.

The job definitely has a database table, but there must be something wrong with that table: I reran the job with a new database table and the Download Complete Object Report function works, but it still fails with the old one.

I also created the database table manually before I ran the first ecs-sync job, which I now realize is not required. Maybe that has something to do with it.

twincitiesguy commented 6 years ago

In that case, maybe the error was actually coming from the ecs-sync service (check /var/log/ecs-sync/ecs-sync.log).

We always recommend letting ecs-sync create the database table for you, as the schema may change over time. There is an up-to-date creation script included with each release (mysql/create_status_table.sql), but people generally don't pay attention to that.

You can examine the table schemas using the mysql client and see how they differ. Perhaps the table you created is missing a column.

[ecssync@localhost ~]$ mysql -p ecs_sync
Enter password: 
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 27
Server version: 5.5.56-MariaDB MariaDB Server

Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [ecs_sync]> desc my_sync_table;
+-------------------+---------------+------+-----+---------+-------+
| Field             | Type          | Null | Key | Default | Extra |
+-------------------+---------------+------+-----+---------+-------+
| source_id         | varchar(750)  | NO   | PRI | NULL    |       |
| target_id         | varchar(1500) | YES  |     | NULL    |       |
| is_directory      | int(11)       | NO   |     | NULL    |       |
| size              | bigint(20)    | YES  |     | NULL    |       |
| mtime             | datetime      | YES  |     | NULL    |       |
| status            | varchar(32)   | NO   | MUL | NULL    |       |
| transfer_start    | datetime      | YES  |     | NULL    |       |
| transfer_complete | datetime      | YES  |     | NULL    |       |
| verify_start      | datetime      | YES  |     | NULL    |       |
| verify_complete   | datetime      | YES  |     | NULL    |       |
| retry_count       | int(11)       | YES  |     | NULL    |       |
| error_message     | varchar(2048) | YES  |     | NULL    |       |
| is_source_deleted | int(11)       | YES  |     | NULL    |       |
+-------------------+---------------+------+-----+---------+-------+
13 rows in set (0.02 sec)

FAD775544 commented 6 years ago

The problem table looks like the following. It has all the fields, but some of the lengths are different (source_id and target_id are both varchar(128)). I will let the table be created automatically from now on, which should resolve the problem.

+-------------------+---------------+------+-----+---------+-------+
| Field             | Type          | Null | Key | Default | Extra |
+-------------------+---------------+------+-----+---------+-------+
| source_id         | varchar(128)  | NO   | PRI | NULL    |       |
| target_id         | varchar(128)  | YES  |     | NULL    |       |
| is_directory      | int(11)       | NO   |     | NULL    |       |
| size              | bigint(20)    | YES  |     | NULL    |       |
| mtime             | datetime      | YES  |     | NULL    |       |
| status            | varchar(32)   | NO   | MUL | NULL    |       |
| transfer_start    | datetime      | YES  |     | NULL    |       |
| transfer_complete | datetime      | YES  |     | NULL    |       |
| verify_start      | datetime      | YES  |     | NULL    |       |
| verify_complete   | datetime      | YES  |     | NULL    |       |
| retry_count       | int(11)       | YES  |     | NULL    |       |
| error_message     | varchar(2048) | YES  |     | NULL    |       |
| is_source_deleted | int(11)       | YES  |     | NULL    |       |
+-------------------+---------------+------+-----+---------+-------+
13 rows in set (0.00 sec)

Also, none of the logs in /var/log/ecs-sync contained any errors at the time of the issue, which is weird.
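
For anyone who wants to compare two tables without reading the output by eye, here is a minimal Python sketch that parses the text printed by `desc` in the mysql client and lists the columns whose types differ. The function names are made up for illustration, and the sample data is abbreviated from the two tables shown in this thread.

```python
def parse_desc(text):
    """Parse mysql-client `desc <table>` output into {field: type}."""
    columns = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +---+ borders and the "rows in set" footer
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if cells[0] == "Field":
            continue  # skip the header row
        columns[cells[0]] = cells[1]
    return columns


def diff_schemas(reference, other):
    """Return {field: (reference_type, other_type)} for every mismatch."""
    diffs = {}
    for field in sorted(set(reference) | set(other)):
        ref_type, other_type = reference.get(field), other.get(field)
        if ref_type != other_type:
            diffs[field] = (ref_type, other_type)
    return diffs


# Abbreviated copies of the two `desc` outputs in this thread.
AUTO_CREATED = """
| Field         | Type          | Null | Key | Default | Extra |
| source_id     | varchar(750)  | NO   | PRI | NULL    |       |
| target_id     | varchar(1500) | YES  |     | NULL    |       |
| error_message | varchar(2048) | YES  |     | NULL    |       |
"""
MANUALLY_CREATED = """
| Field         | Type          | Null | Key | Default | Extra |
| source_id     | varchar(128)  | NO   | PRI | NULL    |       |
| target_id     | varchar(128)  | YES  |     | NULL    |       |
| error_message | varchar(2048) | YES  |     | NULL    |       |
"""

differences = diff_schemas(parse_desc(AUTO_CREATED), parse_desc(MANUALLY_CREATED))
for field, (expected, actual) in differences.items():
    print(f"{field}: {expected} (auto-created) vs {actual} (manual)")
```

A missing column would show up with `None` on one side, which covers the "missing a column" case mentioned earlier as well as length mismatches.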

FAD775544 commented 6 years ago

OK, I don't think it was a problem with the fields in the table. I think it was a problem with the size of the table. I reran the full job and it transferred 16.5 million clips, which took about 5 days, but when I try to download the complete object report it throws the same error.

I think the reason I kept getting the same error previously was the size of the table. When I tried with a much smaller clip list, the object report worked.

Could you please reopen the issue, as this definitely isn't fixed? I am currently using ecs-sync 3.2.7.

FAD775544 commented 6 years ago

I ran two ecs-sync jobs from two different ecs-sync servers. Each copied approximately 16.5 million clips, and both fail the same way when I try to download the complete object report. Both jobs created their database tables themselves.

Also, once I get the error, if I go back to the main page and try to download the report again, I get the same error. After that, if I try to go back to the main page, the service does not respond.

I found the following in /var/log/ecs-sync/ecs-sync.log:

EcsSync v3.2.7
2018-03-28 13:09:10 WARN  [main           ] RestServer: REST server listening at http://localhost:9200/
2018-04-03 11:03:59 WARN  [HikariPool-1 housekeeper] HikariPool: HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=46s98ms687µs448ns).
2018-04-03 11:04:48 WARN  [HikariPool-1 housekeeper] HikariPool: HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=48s244ms417µs688ns).
2018-04-03 11:08:42 WARN  [HikariPool-1 housekeeper] HikariPool: HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=56s791ms226µs915ns).
2018-04-03 11:13:40 WARN  [HikariPool-1 housekeeper] HikariPool: HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=5m8s780ms981µs77ns).
2018-04-03 11:16:26 WARN  [HikariPool-1 housekeeper] HikariPool: HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=3m59s532ms747µs528ns).
2018-04-03 11:21:11 WARN  [HikariPool-1 housekeeper] HikariPool: HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=4m9s486ms948µs16ns).
2018-04-03 11:28:03 WARN  [HikariPool-1 housekeeper] HikariPool: HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=5m49s75ms827µs349ns).
2018-04-03 11:31:13 WARN  [HikariPool-1 housekeeper] HikariPool: HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=4m30s510ms83µs521ns).
2018-04-03 11:35:51 WARN  [HikariPool-1 housekeeper] HikariPool: HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=3m29s308ms230µs469ns).
2018-04-03 11:39:49 WARN  [HikariPool-1 housekeeper] HikariPool: HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=5m45s83ms752µs822ns).
2018-04-03 11:41:18 WARN  [HikariPool-1 housekeeper] HikariPool: HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=2m8s336ms223µs499ns).
2018-04-03 11:45:44 WARN  [HikariPool-1 housekeeper] HikariPool: HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=3m20s809ms664µs299ns).

FAD775544 commented 6 years ago

I managed to dump the database table to CSV manually with the SQL in mysql/csv_report.sql.

The manual dump took about 5 minutes. Maybe there is some sort of timeout in the Java UI or the Java ecs-sync service, because the database table I have is so large (16.5 million rows).
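
To put numbers on the stalls in the HikariPool warnings quoted earlier, the `housekeeper delta` values can be converted to seconds. This is a minimal Python sketch that assumes the delta format seen in those log lines (units `m`, `s`, `ms`, `µs`, `ns`); other units would need to be added, and the function name is made up for illustration.

```python
import re

# Seconds per unit, for the units that appear in the log lines above.
UNIT_SECONDS = {"m": 60.0, "s": 1.0, "ms": 1e-3, "us": 1e-6, "ns": 1e-9}

DELTA_RE = re.compile(r"housekeeper delta=([^)]+)\)")
# Multi-character units come first so "ms" is not read as "m" followed by "s".
# The micro prefix may be U+00B5 (micro sign) or U+03BC (Greek mu).
PART_RE = re.compile(r"(\d+)(ms|[\u00b5\u03bc]s|ns|m|s)")


def delta_seconds(log_line):
    """Return the housekeeper delta of one log line in seconds, or None."""
    match = DELTA_RE.search(log_line)
    if match is None:
        return None
    total = 0.0
    for number, unit in PART_RE.findall(match.group(1)):
        unit = unit.replace("\u00b5", "u").replace("\u03bc", "u")
        total += int(number) * UNIT_SECONDS[unit]
    return total


sample_lines = [
    "2018-04-03 11:03:59 WARN  [HikariPool-1 housekeeper] HikariPool: HikariPool-1 - "
    "Thread starvation or clock leap detected (housekeeper delta=46s98ms687\u00b5s448ns).",
    "2018-04-03 11:28:03 WARN  [HikariPool-1 housekeeper] HikariPool: HikariPool-1 - "
    "Thread starvation or clock leap detected (housekeeper delta=5m49s75ms827\u00b5s349ns).",
]

stalls = [delta_seconds(line) for line in sample_lines]
for seconds in stalls:
    print(f"housekeeper stalled for about {seconds:.1f} s")
```

Stalls that grow from under a minute to several minutes suggest the service's threads were being starved, not that a single fixed timeout was expiring.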

twincitiesguy commented 6 years ago

I found the problem. MySQL's JDBC driver does not fetch rows incrementally by default; it tries to download the entire result set into client memory. That seems a bit ridiculous to me, but I guess they have their reasons.

I did find a way to turn on fetching and have modified the DB service inside ecs-sync to do so. This will be available in the next release after 3.2.7.
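
The difference between loading a whole result set and fetching it in batches can be sketched as follows. This is not the change made in ecs-sync, which is Java and goes through JDBC. It is a Python illustration of the general idea that uses an in-memory SQLite database as a stand-in, and the table name, column values, and function name are all hypothetical. For reference, MySQL Connector/J documents two ways to avoid buffering everything: a forward-only, read-only statement with a fetch size of `Integer.MIN_VALUE`, or `useCursorFetch=true` with a positive fetch size. Which one 3.2.8 uses is not stated here.

```python
import csv
import io
import sqlite3


def export_in_batches(conn, query, out, batch_size=1000):
    """Write query results to `out` as CSV, fetching batch_size rows at a time."""
    writer = csv.writer(out)
    cursor = conn.execute(query)
    writer.writerow([column[0] for column in cursor.description])
    total = 0
    while True:
        rows = cursor.fetchmany(batch_size)  # at most batch_size rows held here
        if not rows:
            break
        writer.writerows(rows)
        total += len(rows)
    return total


# Stand-in data: a tiny table instead of the 16.5 million rows in this issue.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sync_status (source_id TEXT PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO sync_status VALUES (?, ?)",
    ((f"clip-{i}", "done") for i in range(2500)),
)

report = io.StringIO()
exported = export_in_batches(
    conn, "SELECT source_id, status FROM sync_status ORDER BY source_id", report
)
print(f"exported {exported} rows")
```

With `cursor.fetchall()` in place of the loop, every row would be in memory before the first CSV line is written, which is the behaviour that becomes a problem at millions of rows.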

twincitiesguy commented 6 years ago

Fixed in 3.2.8