Closed: tommyJin closed this issue 7 years ago
Hi @hseipp ,
Do you have any idea why this would happen?
Best regards, Tommy
Hi Tommy,
Harald is on vacation, let me answer your questions:
Can you please check whether the new requests get rejected due to a wrong token? In your command you used the literal string '$token' (inside single quotes) instead of the expanded $token value. Also, to better understand which logs are from before/after your new migrate command, could you run the 'requests' query (it lists the submitted requests) immediately after your migrate request and show the output:

curl -v -X POST -H "X-Storage-Token: $token" http://127.0.0.1:8080/hlm/v1/migrate/AUTH_test/testcontainer1/testfile3
curl -v -X POST -H "X-Storage-Token: $token" http://127.0.0.1:8080/hlm/v1/requests/AUTH_test/testcontainer1/testfile3
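To illustrate the quoting pitfall (a minimal sketch; the token value here is just an example):

```shell
# In single quotes the shell does NOT expand $token, so the literal
# string "$token" is sent as the header value; double quotes expand it.
token="AUTH_tk8381f9f1c5964bc79423e8f87f188ca8"   # example value

echo 'X-Storage-Token: $token'   # wrong: prints the literal $token
echo "X-Storage-Token: $token"   # right: prints the actual token value
```

The same applies to the curl -H arguments: with single quotes the proxy receives the six characters `$token` as the token and rejects the request.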
That would require a change in the middleware code. Removing `, username="swift"` would result in using the current process user: middleware.py (line 557), ssh_client.connect(ip_addr)
If you want to try that, then also make sure key-based ssh is set up for that user (as in steps 4.2.1 and 4.2.2 of README.txt), and that the same user is configured in /etc/systemd/system/swifthlm.dispatcher.service.
Regards, Slavisa
Hi @slavisasarafijanovic ,
Thanks for your reply :)
Sorry for abbreviating the token as $token; I did use the newest token for the later requests. In addition, I find that unless I change line 227 from

self.logger = get_logger(conf, name='hlm-middleware',

to

self.logger = get_logger(conf, name='hlm-dispatcher',

the logs from the middleware will not go into the /var/log/swift/hlm.log file.
I removed the ", username='swift'", reinstalled swifthlm, changed 'User=swift' to 'User=osddev' in /etc/systemd/system/swifthlm.dispatcher.service, and ran
ssh-keygen
to overwrite the old key, and then ran
cp /home/osddev/.ssh/id_rsa.pub /home/osddev/.ssh/authorized_keys
The files under /home/osddev/.ssh are now:
[osddev@localhost .ssh]$ ls
authorized_keys  id_rsa  id_rsa.pub  known_hosts
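One thing worth checking after copying the key: sshd typically refuses key-based login if the .ssh directory or authorized_keys file permissions are too open. A sketch (the osddev home directory is the one from this thread; the localhost target in the final comment is an assumption):

```shell
# Tighten permissions so sshd accepts the authorized_keys file.
SSH_DIR="${SSH_DIR:-$HOME/.ssh}"
mkdir -p "$SSH_DIR"
touch "$SSH_DIR/authorized_keys"
chmod 700 "$SSH_DIR"                    # directory: owner-only
chmod 600 "$SSH_DIR/authorized_keys"    # key file: owner read/write
ls -ld "$SSH_DIR" "$SSH_DIR/authorized_keys"
# Then verify passwordless login works for the dispatcher user, e.g.:
#   ssh -o BatchMode=yes osddev@localhost true && echo ok
```

BatchMode makes ssh fail instead of prompting for a password, so a non-zero exit means key-based auth is not working yet.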
Currently, I use the dummy connector instead of the TLC simulator due to a lack of dependencies.
delete all objects and directories
[root@localhost objects]# rm -rf /srv/1/node/sdb1/objects/; rm -rf /srv/2/node/sdb2/objects/; rm -rf /srv/3/node/sdb3/objects/; rm -rf /srv/4/node/sdb4/objects/
[root@localhost objects]# cd /srv/1/node/sdb1/objects; ls
[root@localhost objects]# cd /srv/2/node/sdb2/objects; ls
[root@localhost objects]# cd /srv/3/node/sdb3/objects; ls
[root@localhost objects]# cd /srv/4/node/sdb4/objects; ls
upload file
[root@localhost testresource]# curl -v -X PUT -H 'X-Auth-Token: AUTH_tk8381f9f1c5964bc79423e8f87f188ca8' http://127.0.0.1:8080/v1/AUTH_test/testcontainer1/testfile3 -T /home/osddev/testfile.txt
* About to connect() to 127.0.0.1 port 8080 (#0)
*   Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 8080 (#0)
> PUT /v1/AUTH_test/testcontainer1/testfile3 HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 127.0.0.1:8080
> Accept: */*
> X-Auth-Token: AUTH_tk8381f9f1c5964bc79423e8f87f188ca8
> Content-Length: 12
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
* We are completely uploaded and fine
< HTTP/1.1 201 Created
< Last-Modified: Wed, 05 Jul 2017 08:23:06 GMT
< Content-Length: 0
< Etag: 6f5902ac237024bdd0c176cb93063dc4
< Content-Type: text/html; charset=UTF-8
< X-Trans-Id: tx961883f1f7244433859f0-00595ca1e9
< Date: Wed, 05 Jul 2017 08:23:05 GMT
<
* Connection #0 to host 127.0.0.1 left intact
list all files under objects before executing 'migrate'
[root@localhost objects]# cd /srv/4/node/sdb4/objects; ls
[root@localhost objects]# cd /srv/3/node/sdb3/objects; ls
46
[root@localhost objects]# cd /srv/2/node/sdb2/objects; ls
[root@localhost objects]# cd /srv/1/node/sdb1/objects; ls
46
Run 'requests' before 'migrate'
[root@localhost objects]# curl -v -X GET -H 'X-Auth-Token: AUTH_tk8381f9f1c5964bc79423e8f87f188ca8' http://127.0.0.1:8080/hlm/v1/requests/AUTH_test/testcontainer1/testfile3
* About to connect() to 127.0.0.1 port 8080 (#0)
*   Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 8080 (#0)
> GET /hlm/v1/requests/AUTH_test/testcontainer1/testfile3 HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 127.0.0.1:8080
> Accept: */*
> X-Auth-Token: AUTH_tk8381f9f1c5964bc79423e8f87f188ca8
>
< HTTP/1.1 200 OK
< Content-Length: 880
< Content-Type: text/plain
< X-Trans-Id: tx46c6aa81bbe142249efc7-00595ca24b
< Date: Wed, 05 Jul 2017 08:24:44 GMT
<
* Connection #0 to host 127.0.0.1 left intact
["20170703064113.153--migrate--AUTH_test--testcontainer1--0--testfile3--failed", "20170703065557.679--migrate--AUTH_test--testcontainer1--0--testfile3--failed", "20170703065717.215--migrate--AUTH_test--testcontainer1--0--testfile3--failed", "20170703075041.909--migrate--AUTH_test--testcontainer1--0--testfile3--failed", "20170703075107.500--migrate--AUTH_test--testcontainer1--0--testfile3--failed", "20170703075124.025--migrate--AUTH_test--testcontainer1--0--testfile3--failed", "20170703100804.066--migrate--AUTH_test--testcontainer1--0--testfile3--failed", "20170705035315.145--migrate--AUTH_test--testcontainer1--0--testfile3--failed", "20170705053853.445--migrate--AUTH_test--testcontainer1--0--testfile3--failed", "20170705054001.232--migrate--AUTH_test--testcontainer1--0--testfile3--failed", "20170705054259.735--migrate--AUTH_test--testcontainer1--0--testfile3--failed"]
[root@localhost objects]#
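The entries returned by the 'requests' query appear to encode timestamp, operation, account, container, policy index, object, and status, separated by "--". A quick way to pull out just the status of each entry (a sketch; the field layout is inferred from the output above):

```shell
# Each entry ends in "--<status>"; strip everything up to the last
# "--" and drop the JSON punctuation. Sample data from this thread.
requests='["20170703064113.153--migrate--AUTH_test--testcontainer1--0--testfile3--failed", "20170705082611.656--migrate--AUTH_test--testcontainer1--0--testfile3--pending"]'
echo "$requests" | tr ',' '\n' | sed 's/.*--//; s/[]"]//g'
# prints:
# failed
# pending
```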
Run 'migrate' and 'requests'
[root@localhost objects]# curl -v -X POST -H 'X-Auth-Token: AUTH_tk8381f9f1c5964bc79423e8f87f188ca8' http://127.0.0.1:8080/hlm/v1/migrate/AUTH_test/testcontainer1/testfile3
* About to connect() to 127.0.0.1 port 8080 (#0)
*   Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 8080 (#0)
> POST /hlm/v1/migrate/AUTH_test/testcontainer1/testfile3 HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 127.0.0.1:8080
> Accept: */*
> X-Auth-Token: AUTH_tk8381f9f1c5964bc79423e8f87f188ca8
>
< HTTP/1.1 200 OK
< Content-Length: 26
< Content-Type: text/plain
< X-Trans-Id: tx7fa74f58c1d5434891860-00595ca2a3
< Date: Wed, 05 Jul 2017 08:26:11 GMT
<
Accepted migrate request.
* Connection #0 to host 127.0.0.1 left intact
[root@localhost objects]# curl -v -X GET -H 'X-Auth-Token: AUTH_tk8381f9f1c5964bc79423e8f87f188ca8' http://127.0.0.1:8080/hlm/v1/requests/AUTH_test/testcontainer1/testfile3
* About to connect() to 127.0.0.1 port 8080 (#0)
*   Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 8080 (#0)
> GET /hlm/v1/requests/AUTH_test/testcontainer1/testfile3 HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 127.0.0.1:8080
> Accept: */*
> X-Auth-Token: AUTH_tk8381f9f1c5964bc79423e8f87f188ca8
>
< HTTP/1.1 200 OK
< Content-Length: 1041
< Content-Type: text/plain
< X-Trans-Id: tx8d51451755a34fbd98ecb-00595ca2a8
< Date: Wed, 05 Jul 2017 08:26:16 GMT
<
["20170705082611.656--migrate--AUTH_test--testcontainer1--0--testfile3--pending", "20170703064113.153--migrate--AUTH_test--testcontainer1--0--testfile3--failed", "20170703065557.679--migrate--AUTH_test--testcontainer1--0--testfile3--failed", "20170703065717.215--migrate--AUTH_test--testcontainer1--0--testfile3--failed", "20170703075041.909--migrate--AUTH_test--testcontainer1--0--testfile3--failed", "20170703075107.500--migrate--AUTH_test--testcontainer1--0--testfile3--failed", "20170703075124.025--migrate--AUTH_test--testcontainer1--0--testfile3--failed", "20170703100804.066--migrate--AUTH_test--testcontainer1--0--testfile3--failed", "20170705035315.145--migrate--AUTH_test--testcontainer1--0--testfile3--failed", "20170705053853.445--migrate--AUTH_test--testcontainer1--0--testfile3--failed", "20170705054001.232--migrate--AUTH_test--testcontainer1--0--testfile3--failed", "20170705054259.735--migrate--AUTH_test--testcontainer1--0--testfile3--failed", "20170705082602.466--migrate--AUTH_test--testcontainer1--0--testfile3--failed"]
* Connection #0 to host 127.0.0.1 left intact
[root@localhost objects]#
From the output we can see that some older requests have not been handled, e.g. "20170703064113.153--migrate--AUTH_test--testcontainer1--0--testfile3--failed".
list all files under /objects after executing migrate
[root@localhost objects]# cd /srv/4/node/sdb4/objects; ls
[root@localhost objects]# cd /srv/3/node/sdb3/objects; ls
142  39  46  53  776
[root@localhost objects]# cd /srv/2/node/sdb2/objects; ls
142  39  53  776
[root@localhost objects]# cd /srv/1/node/sdb1/objects; ls
142  39  46
[root@localhost objects]#
I also checked the .db file named a5edc6b2f1d47145c012dee12f60876d.db under /srv/1/node/sdb1/containers/663/76d/a5edc6b2f1d47145c012dee12f60876d/, viewed its tables, and found the following records in the 'object' table:
ROWID name created_at size content_type etag deleted storage_policy_index
23 20170613061632.069--migrate--AUTH_test--container1--0 1498733644.78429 0 application/deleted noetag 1 0
24 20170613061745.698--recall--AUTH_test--container1--0 1498806380.70219 0 application/deleted noetag 1 0
25 20170613170807.015--migrate--AUTH_test--container1--0 1498809922.97080 0 application/deleted noetag 1 0
26 20170614033858.964--migrate--AUTH_test--sofContainer--0 1498810769.39533 0 application/deleted noetag 1 0
28 20170628020359.579--migrate--AUTH_test--sofContainer--0 1498812757.51289 0 application/deleted noetag 1 0
29 20170630082827.167--migrate--AUTH_test--testcontainer1--0--testfile 1498813270.77409 0 application/deleted noetag 1 0
31 20170628034426.764--migrate--AUTH_test--container1--0 1499049591.41261 0 application/deleted noetag 1 0
33 20170628034606.025--migrate--AUTH_test--container1--0 1499050306.93780 0 application/deleted noetag 1 0
35 20170628034616.500--migrate--AUTH_test--container1--0 1499062413.74734 0 application/deleted noetag 1 0
37 20170628053149.246--migrate--AUTH_test--container1--0 1499062653.96940 0 application/deleted noetag 1 0
39 20170628113029.246--migrate--AUTH_test--sofContainer--0 1499062789.80359 0 application/deleted noetag 1 0
41 20170629013549.762--migrate--AUTH_test--sofContainer--0 1499063355.95216 0 application/deleted noetag 1 0
43 20170629013804.754--migrate--AUTH_test--sofContainer--0 1499063462.72824 0 application/deleted noetag 1 0
44 20170629014706.860--migrate--AUTH_test--sofContainer--0 1499063574.93438 0 application/deleted noetag 1 0
46 20170629023514.918--migrate--AUTH_test--sofContainer--0 1499064930.09461 0 application/deleted noetag 1 0
52 20170629024143.265--migrate--AUTH_test--sofContainer--0 1499068529.61031 0 application/deleted noetag 1 0
53 20170629060705.303--migrate--AUTH_test--sofContainer--0 1499071963.70294 0 application/deleted noetag 1 0
55 20170629080249.234--migrate--AUTH_test--sofContainer--0--testfile 1499226301.85943 0 application/deleted noetag 1 0
62 20170629080601.608--migrate--AUTH_test--sofContainer--0--testfile 1499239244.77141 0 application/deleted noetag 1 0
63 20170629083012.143--migrate--AUTH_test--sofContainer--0--testfile 1499240531.14688 0 application/deleted noetag 1 0
64 20170629091501.338--migrate--AUTH_test--testcontainer--0--testfile 1499241468.00168 0 application/deleted noetag 1 0
65 20170629095519.623--migrate--AUTH_test--testcontainer--0--testfile1 1499241710.74410 0 application/deleted noetag 1 0
66 20170629095528.003--migrate--AUTH_test--testcontainer--0--testfile1 1499241713.85864 0 application/deleted noetag 1 0
67 20170630091430.232--migrate--AUTH_test--testcontainer1--0--testfile1 1499241714.80492 0 application/deleted noetag 1 0
68 20170703024511.252--migrate--AUTH_test--testcontainer1--0--testfile2 1499241716.13648 0 application/deleted noetag 1 0
69 20170703025226.838--migrate--AUTH_test--testcontainer1--0--testfile2 1499241717.15535 0 application/deleted noetag 1 0
70 20170703061550.748--migrate--AUTH_test--testcontainer1--0--testfile2 1499241717.94710 0 application/deleted noetag 1 0
71 20170703061741.493--migrate--AUTH_test--testcontainer1--0--testfile2 1499241718.67645 0 application/deleted noetag 1 0
72 20170703062020.472--migrate--AUTH_test--testcontainer1--0--testfile2 1499241719.46636 0 application/deleted noetag 1 0
73 20170703062921.548--migrate--AUTH_test--testcontainer1--0--testfile2 1499241720.31480 0 application/deleted noetag 1 0
74 20170703065557.679--migrate--AUTH_test--testcontainer1--0--testfile3 1499241721.99651 0 application/deleted noetag 1 0
75 20170703064113.153--migrate--AUTH_test--testcontainer1--0--testfile3 1499241721.26659 0 application/deleted noetag 1 0
76 20170703065717.215--migrate--AUTH_test--testcontainer1--0--testfile3 1499241722.85521 0 application/deleted noetag 1 0
77 20170703075107.500--migrate--AUTH_test--testcontainer1--0--testfile3 1499241724.28387 0 application/deleted noetag 1 0
78 20170703075041.909--migrate--AUTH_test--testcontainer1--0--testfile3 1499241723.56999 0 application/deleted noetag 1 0
79 20170703100804.066--migrate--AUTH_test--testcontainer1--0--testfile3 1499241725.69768 0 application/deleted noetag 1 0
80 20170703075124.025--migrate--AUTH_test--testcontainer1--0--testfile3 1499241724.98917 0 application/deleted noetag 1 0
81 20170705035315.145--migrate--AUTH_test--testcontainer1--0--testfile3 1499241726.44655 0 application/deleted noetag 1 0
82 20170705053853.445--migrate--AUTH_test--testcontainer1--0--testfile3 1499241727.41514 0 application/deleted noetag 1 0
83 20170705054001.232--migrate--AUTH_test--testcontainer1--0--testfile3 1499241728.23511 0 application/deleted noetag 1 0
84 20170705055130.887--migrate--AUTH_test--testcontainer1--0--test1 1499241729.71925 0 application/deleted noetag 1 0
85 20170705054259.735--migrate--AUTH_test--testcontainer1--0--testfile3 1499241729.00254 0 application/deleted noetag 1 0
86 20170705055553.306--migrate--AUTH_test--testcontainer1--0--test1 1499241730.44220 0 application/deleted noetag 1 0
89 20170705082602.466--migrate--AUTH_test--testcontainer1--0--testfile3 1499243170.09840 0 application/deleted noetag 1 0
90 20170705082611.656--migrate--AUTH_test--testcontainer1--0--testfile3 1499243200.87371 0 application/deleted noetag 1 0
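For reference, records like the ones above can be pulled straight from the container DB with the sqlite3 command-line shell (a sketch; the DB path is the one shown in this thread, so adjust the partition and hash for your system):

```shell
# Query the 'object' table of the container DB directly.
# Path copied from this thread; it will differ on other systems.
DB=/srv/1/node/sdb1/containers/663/76d/a5edc6b2f1d47145c012dee12f60876d/a5edc6b2f1d47145c012dee12f60876d.db
if [ -f "$DB" ]; then
    sqlite3 "$DB" "SELECT ROWID, name, created_at, deleted FROM object ORDER BY ROWID;"
else
    echo "DB not found: $DB"
fi
```

This avoids copying the .db file around: deleted=1 rows are tombstones that the container replicator has not reclaimed yet.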
Is this output normal? The old requests seem to be stuck, but the new requests can now be added to the request queue (and I don't know why, because a few days ago a 'migrate' request would only return 'Accepted' but would not show up in the 'requests' output).
Thanks a lot for reading and replying to my questions.
Best regards, Tommy
Hi Tommy,
The output you show seems normal, but to better understand what is going on on your test system, more output is needed. You show the output of the 'requests' HLM query only 5 seconds after the migration is submitted; it may take longer for the dispatcher to pick the 'migrate' HLM request from the queue, process it, and remove it from the queue. So a 'pending' entry at that point is expected, but it would be good to see the output of a 'requests' HLM query at a later time. The content of the .db file suggests the new request was eventually processed and removed from the queue (I need to verify that after recreating your test scenario), and only after that should the previously failed requests for the same container/object also be removed from the queue.
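One way to check the queue at a later time is to poll the 'requests' query until no 'pending' entry is left (a sketch; the token and URL are the ones from this thread, and the retry count and interval are arbitrary assumptions):

```shell
# Re-check the HLM request queue periodically instead of only once,
# 5 s after submitting. Token/URL taken from this thread.
TOKEN=AUTH_tk8381f9f1c5964bc79423e8f87f188ca8   # use your current token
URL=http://127.0.0.1:8080/hlm/v1/requests/AUTH_test/testcontainer1/testfile3
for i in 1 2 3 4 5 6; do
    curl -s -H "X-Auth-Token: $TOKEN" "$URL" | grep -q 'pending' \
        || { echo "no pending entries left"; break; }
    sleep 10
done
```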
As for the logs: I was getting logs from the middleware into /var/log/swift/hlm.log without the code change you made. I will try to recreate your environment (from the previous posts in this issue) and check whether I can reproduce the log behavior you observed.
Regards, Slavisa
The issue in the title is resolved: HLM needs time to process the requests, and the related settings must be configured appropriately.
The log issue remains: 'hlm.log' only stores logs from Dispatcher.py.
Thanks for everybody's help.
Hi there,
Basic info: python 2.7, swift 2.5.0 (Liberty), swifthlm 0.2.2 (before the 2017/06/20 submission)
The username I use is 'osddev', and I only modified some swift config files as the HLM documentation describes. Policies and other settings remain the same.
I changed the key-value pair 'User=swift' to 'User=osddev' in the file swifthlm.dispatcher.service.
I encountered an issue where old records always remain. I did some upload tests before and after installing HLM, so there are many folders under /srv/(1,2,3,4)/node/sdb(1,2,3,4)/objects. To find out where and how HLM objects (in SWIFTHLM_PENDING_REQUESTS_CONTAINER) are stored, I took the steps below:
to delete all subfolders under the objects directory.
Then I got the token, uploaded an object 'testfile3', and ran another command,
and the log is shown below.
I added some logger calls of my own to the source code, so the line numbers may not match the original source.
As we can see from the log above:
My questions are:
BTW, sometimes the logs are not updated by the program itself: after I restart the swift and swifthlm services, the new startup and processing logs are not written to hlm.log.
Best regards, Tommy