cskoglun / ciscodnacbackupctl

Purge only deletes one backup and then exits with an error #14

Closed T185 closed 1 year ago

T185 commented 2 years ago

Dear all,

I hope you can help me :)

We have a DNA Center, version 2.2.3.5, running on a cluster of 3 servers.

After a DNA Center update, we want to purge all incompatible backups with this command: ciscodnacbackupctl purge --incompatible

After that, I see a list of all incompatible backups, and after typing "y" it should start purging.

Sometimes I get messages like this:

Warning: Confirm if you want to delete these backups (y/n): y
Deleting... (this could take a while - as it's synchronous API calls)
Deleted backup (backup_id='953f5f30-ab06-4027-80f8-0a5ee7f736fc')

But soon after this message, I get this error:

  File "/usr/local/bin/ciscodnacbackupctl", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/ciscodnacbackupctl/cli.py", line 280, in purge
    purge = cli.purge(keep=keep, incompatible=incompatible, force=force)
  File "/usr/local/lib/python3.9/site-packages/ciscodnacbackupctl/__init__.py", line 554, in purge
    self.delete(backup_id)
  File "/usr/local/lib/python3.9/site-packages/ciscodnacbackupctl/__init__.py", line 340, in delete
    task(id)
  File "/usr/local/lib/python3.9/site-packages/ciscodnacbackupctl/__init__.py", line 327, in task
    data = self.api._request(type="delete", url=url)
  File "/usr/local/lib/python3.9/site-packages/ciscodnacbackupctl/__init__.py", line 224, in _request
    "Error: ({})".format(data["response"].get("error", "Not Available"))
KeyError: 'response'

In the end, the backup named above is deleted and no longer visible. But the other 20 backups are still there and are not deleted.

From my point of view, the purge command deletes only the first backup from the "history" list and then exits with the error message above. If I re-enter the command, the next backup is deleted, and so on.
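The behaviour described above (one deletion, then an abort) is what a loop does when the first exception is allowed to propagate. A minimal sketch of a loop that keeps going and reports failures at the end follows. It is not the code in ciscodnacbackupctl: `purge_all` and `delete_one` are hypothetical names, and `delete_one` stands in for whatever callable performs a single deletion.

```python
from typing import Callable, Dict, Iterable, List


def purge_all(
    backup_ids: Iterable[str],
    delete_one: Callable[[str], None],
) -> Dict[str, object]:
    """Try to delete every backup, recording failures instead of aborting.

    `delete_one` is a hypothetical callable that deletes a single backup
    and raises an exception on failure.
    """
    deleted: List[str] = []
    failed: Dict[str, str] = {}
    for backup_id in backup_ids:
        try:
            delete_one(backup_id)
        except Exception as exc:  # broad on purpose: one failure must not stop the loop
            failed[backup_id] = repr(exc)
        else:
            deleted.append(backup_id)
    return {"deleted": deleted, "failed": failed}
```

With a loop like this, a single `KeyError` would end up in the `failed` mapping while the remaining backups are still attempted.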

Do you have the same issue, or can you give me a hint where to look?

Kind Regards

robertcsapo commented 2 years ago

It seems the API response doesn't include the "response" key: https://github.com/cskoglun/ciscodnacbackupctl/blob/620ddd4b78fea0d8c666fd0cea52654a9b147bc3/ciscodnacbackupctl/__init__.py#L224

Could you run the URI in Postman or curl, to see what the response is? DELETE /api/system/v1/maglev/backup/<id>

T185 commented 2 years ago

Hi Robert,

Sorry for the late response, I was on vacation.

Here is the output of curl:

curl --location --request DELETE 'https://****/api/system/v1/maglev/backup/904f48d5-291f-46d1-ae8b-9279db0e7465' \
> --header 'x-auth-token: *****'

{"response": {"message": "Deleted backup (backup_id='904f48d5-291f-46d1-ae8b-9279db0e7465')", "status": "ok"}, "version": "1.5.1"}

Your script also deletes the first backup, but not the other ones...

Kind regards, Marcel

T185 commented 2 years ago

Hi all,

I have done a few more tests with curl. I deleted some of the backups with the curl command without any issues.

But after the 5th or 6th command, I receive this error message: The request to the backend service is timing out. Please refer to the backend service's logs for more details.

As far as I know, this error message is sent by DNA Center. Do you know where to look on DNA Center, or is it better to open a TAC case?

Kind regards

robertcsapo commented 2 years ago

Hey @T185, I've added a debug flag in this release. It should help you debug the script instead of using curl: https://pypi.org/project/ciscodnacbackupctl/0.2.10/

pip install ciscodnacbackupctl==0.2.10

Example:

ciscodnacbackupctl --debug whoami
ciscodnacbackupctl --debug delete --backup_id a0f0f253-d1bc-4126-aa57-e7e9829c5c99
ciscodnacbackupctl --debug list

Regarding the issue, is this error also reflected in the Web UI?

T185 commented 2 years ago

Hi Robert,

I have done the debug steps:

[root@backup ~]# ciscodnacbackupctl --debug delete --backup_id 21a0fa96-3c1b-423f-96a9-c0caf8c0d0f8
Debug mode: True
2022-06-22 09:39:45,521 INFO: Cisco DNA Center Authentication (https://HOSTNAME/dna/system/api/v1/auth/token)
2022-06-22 09:39:45,522 DEBUG: Starting new HTTPS connection (1): HOSTNAME:443
send: b'POST /dna/system/api/v1/auth/token HTTP/1.1\r\nHost: HOSTNAME\r\nUser-Agent: python-requests/2.27.1\r\nAccept-Encoding: gzip, deflate\r\nAccept: application/json\r\nConnection: keep-alive\r\nContent-Type: application/json\r\nContent-Length: 0\r\nAuthorization: Basic U0FfY3MwMTNfRE5BLUJja19QOmMoMCk4dWorKDM3K0QyOCpXKEYr\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Content-Type: application/json
header: Content-Length: 744
header: Connection: keep-alive
header: Date: Wed, 22 Jun 2022 07:39:45 GMT
header: X-Password-Expiry-Days: -1, -1, -1
header: Server: webserver
header: x-request-id: 47f79b28d73bd3bc09170fe327cf545f
header: Vary: Origin
header: Access-Control-Allow-Origin: HOSTNAME
header: Via: api-gateway
header: Cache-Control: no-store
header: Pragma: no-cache
header: Content-Security-Policy: default-src 'self' 'unsafe-inline' 'unsafe-eval' blob: data:
header: X-Content-Type-Options: nosniff
header: X-XSS-Protection: 1
header: Strict-Transport-Security: max-age=31536000; includeSubDomains
header: X-Frame-Options: SAMEORIGIN
2022-06-22 09:39:45,860 DEBUG: https://HOSTNAME:443 "POST /dna/system/api/v1/auth/token HTTP/1.1" 200 744
2022-06-22 09:39:45,862 DEBUG: {"Token":"TOKEN"}
2022-06-22 09:39:45,864 DEBUG: Starting new HTTPS connection (1): HOSTNAME:443
send: b'DELETE /api/system/v1/maglev/backup/21a0fa96-3c1b-423f-96a9-c0caf8c0d0f8 HTTP/1.1\r\nHost: HOSTNAME\r\nUser-Agent: python-requests/2.27.1\r\nAccept-Encoding: gzip, deflate\r\nAccept: application/json\r\nConnection: keep-alive\r\nX-Auth-Token: TOKEN\r\nContent-Type: application/json\r\nContent-Length: 0\r\n\r\n'
reply: 'HTTP/1.1 504 Gateway Time-out\r\n'
header: Date: Wed, 22 Jun 2022 07:40:45 GMT
header: Content-Type: application/json
header: Transfer-Encoding: chunked
header: Connection: keep-alive
header: x-request-id: 06cbe4030ab6b16a3d97fffed4d290d8
header: Vary: Origin
header: Access-Control-Allow-Origin: HOSTNAME
header: Server: kong/0.14.1
header: Cache-Control: no-store
header: Pragma: no-cache
header: Content-Security-Policy: default-src 'self' 'unsafe-inline' 'unsafe-eval' blob: data:
header: X-Content-Type-Options: nosniff
header: X-XSS-Protection: 1
header: Strict-Transport-Security: max-age=31536000; includeSubDomains
header: X-Frame-Options: SAMEORIGIN
2022-06-22 09:40:45,953 DEBUG: https://HOSTNAME:443 "DELETE /api/system/v1/maglev/backup/21a0fa96-3c1b-423f-96a9-c0caf8c0d0f8 HTTP/1.1" 504 None
2022-06-22 09:40:45,955 DEBUG: {"message":"The request to the backend service is timing out. Please refer to the backend service's logs for more details."}

Traceback (most recent call last):
  File "/usr/local/bin/ciscodnacbackupctl", line 8, in <module>
    sys.exit(entry())
  File "/usr/local/lib/python3.9/site-packages/ciscodnacbackupctl/cli.py", line 367, in entry
    cli(obj={})
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/ciscodnacbackupctl/cli.py", line 219, in delete
    cli.delete(id)
  File "/usr/local/lib/python3.9/site-packages/ciscodnacbackupctl/__init__.py", line 360, in delete
    task(id)
  File "/usr/local/lib/python3.9/site-packages/ciscodnacbackupctl/__init__.py", line 344, in task
    data = self.api._request(type="delete", url=url)
  File "/usr/local/lib/python3.9/site-packages/ciscodnacbackupctl/__init__.py", line 234, in _request
    "Error: ({})".format(data["response"].get("error", "Not Available"))
KeyError: 'response'

Interesting information: the backup is no longer in the list of backups or in the DNA Center web UI. So from my point of view, the backup was deleted, but DNA Center failed to send a proper response.

The backup size is about 400 GB. Is it possible that this is a timing problem? The backup is stored with Assurance data on NFS storage.
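In the debug log above, the 504 arrives about 60 seconds after the DELETE was sent, and the backup is gone afterwards. One way to handle this on the client side is to treat a 504 as "outcome unknown" and then check the backup list. The sketch below only illustrates that idea. `delete_and_confirm`, `send_delete` and `list_backup_ids` are hypothetical names, not functions of ciscodnacbackupctl, and the retry count and delay are arbitrary assumptions.

```python
import time
from typing import Callable, Iterable


def delete_and_confirm(
    backup_id: str,
    send_delete: Callable[[str], int],
    list_backup_ids: Callable[[], Iterable[str]],
    attempts: int = 5,
    delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once the backup is known to be gone.

    `send_delete` (hypothetical) issues the DELETE and returns the HTTP status.
    `list_backup_ids` (hypothetical) returns the IDs of the backups that still exist.
    """
    status = send_delete(backup_id)
    if status == 200:
        return True
    if status != 504:
        return False  # a real error, not a gateway timeout
    # 504: the gateway gave up waiting, but the backend may still finish the job.
    for _ in range(attempts):
        if backup_id not in set(list_backup_ids()):
            return True
        sleep(delay)
    return False
```

Whether the backend really completes every deletion after a gateway timeout is not confirmed in this thread, so the final `False` case still needs manual follow-up.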

Kind Regards

robertcsapo commented 2 years ago

@T185 we haven't tested this on NFS-based (Assurance) backups, because they are incremental in size.

"Automation data consists of Cisco DNA Center databases, credentials, file systems, and files. The automation backup is a full backup."

vs

"The Assurance data consists of network assurance and analytics data. The first backup of Assurance data is a full backup. After that, backups are incremental."

https://www.cisco.com/c/en/us/td/docs/cloud-systems-management/network-automation-and-management/dna-center/2-3-3/admin_guide/b_cisco_dna_center_admin_guide_2_3_3/b_cisco_dna_center_admin_guide_2_3_3_chapter_0110.html

But based on your debug output, we could catch this: {"message":"The request to the backend service is timing out. Please refer to the backend service's logs for more details."}
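A minimal sketch of what catching this could look like, assuming only the two payload shapes seen in this thread: a body with a "response" object, and the gateway body with only a top-level "message". `describe_error` is a hypothetical helper, not the actual fix in the package.

```python
from typing import Any


def describe_error(data: Any) -> str:
    """Extract an error text without assuming the "response" key exists."""
    if not isinstance(data, dict):
        return "Not Available"
    response = data.get("response")
    # Normal API shape: {"response": {"error": "..."}}
    if isinstance(response, dict) and "error" in response:
        return str(response["error"])
    # Gateway shape seen in the debug log: {"message": "..."}
    if "message" in data:
        return str(data["message"])
    return "Not Available"
```

The caller could then format the message as before, for example `"Error: ({})".format(describe_error(data))`, and a body without "response" would no longer raise `KeyError`.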

"The Backup is not in the list of Backups or on the Website of DNA anymore. So from my point of view, the Backup was deleted, but DNA Center has an error in responding."

If Cisco DNA Center isn't able to recover, you should open a Cisco TAC case to investigate the issue.

T185 commented 2 years ago

Hi Robert,

Thank you for your answer. I will open a Cisco TAC case for this and keep you updated!

Kind regards