Closed by Surrina-ki 2 weeks ago
Hi @Surrina-ki, thanks for reporting!
The strange thing is that at the MOVE MANIFEST stage the manager should have already uploaded all files, so every table should show either 100% progress or an error. So this looks like either a progress display issue or a size calculation issue.
Could you send the manager logs so that I can check what happened?
It's not just a display issue. Although it shows "DONE," when I use it to restore data, it reports an error.
Here's the log that appeared when I executed the following commands:
sctool backup -c test -L 's3:test' -K 'test_keyspace2' --rate-limit 100
sctool -c test stop backup/1f2c7378-fd23-46c1-82b3-6528b5f27f86
sctool -c test progress backup/1f2c7378-fd23-46c1-82b3-6528b5f27f86
Then I executed the start command:
What SM version are you using? If it's older than 3.2.6, I suspect this is caused by https://github.com/scylladb/scylla-manager/issues/3729, which was fixed in SM 3.2.6. If that's the case, please update SM to the newest version and check whether errors during the upload stage are now reported correctly.
Thank you. I was using version 3.2.3. After switching to version 3.2.8, I was able to resume backups normally after stopping and starting.
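For anyone checking whether their own install predates the fix, the comparison against 3.2.6 can be sketched as below. This is only a minimal illustration: `parse_version` and `needs_upgrade` are hypothetical helper names, and the sketch assumes the version is a plain dotted number such as 3.2.3 (a real version string may carry extra suffixes that would need stripping first).

```python
# Minimal sketch: decide whether an SM version predates the 3.2.6 fix.
# Assumption: the version is a plain dotted numeric string like "3.2.3".

def parse_version(text):
    """Turn a dotted version such as '3.2.8' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Release that contains the fix, according to the reply above.
FIXED_IN = parse_version("3.2.6")


def needs_upgrade(installed):
    """Return True when the installed version is older than the fixed one."""
    return parse_version(installed) < FIXED_IN


if __name__ == "__main__":
    for version in ("3.2.3", "3.2.6", "3.2.8"):
        state = "older than the fix" if needs_upgrade(version) else "has the fix"
        print(version, state)
```

Comparing tuples of integers rather than raw strings avoids the usual pitfall where "3.10.0" would sort before "3.2.6" as text.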
@Michal-Leszczynski I also want to ask: there are many entries like the ones below in the agent logs. Could they cause any issues?
{"L":"INFO","T":"2024-06-13T06:52:42.008Z","M":"http: TLS handshake error from 192.168.100.4:50596: EOF"}
{"L":"INFO","T":"2024-06-13T06:52:42.008Z","M":"http: TLS handshake error from 192.168.100.4:62548: read tcp 192.168.100.100:10001->192.168.100.4:62548: read: connection reset by peer"}
These messages can show up in some parts of the logs. They are probably caused by temporary connectivity or infrastructure issues. You don't need to worry about them: if they had broken something, it would also be visible in other, more specific error messages.
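If you want to see how often these messages occur and which hosts they come from, a rough tally can be sketched as follows. It assumes the agent log has one JSON object per line with the message under the "M" key, as in the two entries quoted above; `count_tls_handshake_errors` is a hypothetical helper written for this illustration, not part of any Scylla Manager tooling.

```python
# Minimal sketch: count "TLS handshake error" entries per peer IP.
# Assumption: one JSON object per log line, message stored under "M".
import json
from collections import Counter

PREFIX = "http: TLS handshake error from "


def count_tls_handshake_errors(lines):
    """Return a Counter mapping peer IP to number of handshake errors."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(entry, dict):
            continue
        message = entry.get("M", "")
        if not isinstance(message, str) or not message.startswith(PREFIX):
            continue
        # The peer is "ip:port", followed by ": <reason>".
        peer = message[len(PREFIX):].split(": ", 1)[0]
        ip = peer.rsplit(":", 1)[0]
        counts[ip] += 1
    return counts


if __name__ == "__main__":
    sample = [
        '{"L":"INFO","T":"2024-06-13T06:52:42.008Z","M":"http: TLS handshake error from 192.168.100.4:50596: EOF"}',
        '{"L":"INFO","T":"2024-06-13T06:52:42.008Z","M":"http: TLS handshake error from 192.168.100.4:62548: read tcp 192.168.100.100:10001->192.168.100.4:62548: read: connection reset by peer"}',
    ]
    for ip, total in count_tls_handshake_errors(sample).items():
        print(ip, total)
```

A count that stays small and sporadic would be consistent with the transient connectivity explanation above; a steady stream from a single host may be worth a closer look at that host.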
I encountered some issues while testing the pause and resume functionality of Scylla Manager's backup operations. First, I started a backup task using the command. Then, I paused the task using the stop command. When I attempted to resume the task using the start command, the status quickly changed to DONE, but the progress did not reach 100%.
I then tried using suspend and resume, but the resume did not take effect.