SpectraLogic / ds3_java_browser


Error on DELETE of larger data sets. #318

Open michaelmehl opened 7 months ago

michaelmehl commented 7 months ago

I'm not certain what the cutoff size is yet, but when we DELETE larger folders in a bucket, the Eon browser appears to time out and error; after waiting long enough, the data does eventually get deleted. I do not think this is related to the 500k limitation.

2023-09-07 13:34:38,999 INFO [JavaFX Application Thread] c.s.d.g.c.d.Ds3PanelPresenter [Ds3PanelPresenter.java:509] Got delete object event
2023-09-07 13:34:39,003 INFO [JavaFX Application Thread] c.s.d.g.c.d.Ds3PanelPresenter [Ds3PanelPresenter.java:519] Delete folder TreeItem [ value: com.spectralogic.dsbrowser.gui.components.ds3panel.ds3treetable.Ds3TreeTableValue@value ]
2023-09-07 13:34:39,004 INFO [JavaFX Application Thread] c.s.d.g.s.d.DeleteService [DeleteService.java:123] Got delete folder event
2023-09-07 13:34:43,621 INFO [pool-3-thread-9] c.s.d.n.NetworkClientImpl [NetworkClientImpl.java:221] Sending request: DELETE http://IPADDRESS:80 /rest/folder/PATH
2023-09-07 13:34:43,622 DEBUG [pool-3-thread-9] c.s.d.u.Signature [Signature.java:52] String to sign: DELETE\n\napplication/xml\nThu, 07 Sep 2023 18:34:43 +0000\n/rest/folder/PATH
2023-09-07 14:34:43,740 ERROR [pool-3-thread-9] c.s.d.g.s.t.CancelAllRunningJobsTask [Ds3DeleteFoldersTask.java:71] Failed to delete folder FOLDER:PATH
java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:171)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
    at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
    at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282)
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
    at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
    at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
    at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
    at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
    at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
    at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
    at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:72)
    at com.spectralogic.ds3client.networking.NetworkClientImpl$RequestExecutor.execute(NetworkClientImpl.java:234)
    at com.spectralogic.ds3client.networking.NetworkClientImpl.getResponse(NetworkClientImpl.java:174)
    at com.spectralogic.ds3client.Ds3ClientImpl.deleteFolderRecursivelySpectraS3(Ds3ClientImpl.java:838)
    at com.spectralogic.dsbrowser.gui.services.tasks.Ds3DeleteFoldersTask.call(Ds3DeleteFoldersTask.java:67)
    at com.spectralogic.dsbrowser.gui.services.tasks.Ds3DeleteFoldersTask.call(Ds3DeleteFoldersTask.java:37)
    at javafx.concurrent.Task$TaskCallable.call(Task.java:1423)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2023-09-07 14:34:43,742 ERROR [JavaFX Application Thread] c.s.d.g.c.d.DeleteItemPresenter [DeleteItemPresenter.java:180] Failed to delete selected item(s): folderDeleteFailed null
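
From the stack trace, the exception is the HTTP client's socket (read) timeout firing inside the SDK's networking layer while waiting for the BP's response. For illustration only, here is a minimal Apache HttpClient 4.x sketch of the kind of setting involved; this is not the Eon browser's actual configuration, the endpoint and timeout value are placeholders, and the Spectra S3 signing (the "String to sign" DEBUG line above) is omitted:

```java
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpDelete;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

public final class ReadTimeoutDemo {
    public static void main(final String[] args) throws Exception {
        // setSocketTimeout governs how long a read blocks waiting for the
        // server's response headers; when it elapses, HttpClient throws the
        // java.net.SocketTimeoutException ("Read timed out") seen above.
        final RequestConfig config = RequestConfig.custom()
                .setConnectTimeout(60_000)    // establishing the TCP connection
                .setSocketTimeout(3_600_000)  // illustrative read timeout (1 hour)
                .build();

        try (CloseableHttpClient client = HttpClients.custom()
                .setDefaultRequestConfig(config)
                .build();
             CloseableHttpResponse response =
                     client.execute(new HttpDelete("http://IPADDRESS/rest/folder/PATH"))) {
            System.out.println(response.getStatusLine());
        }
    }
}
```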

blitt commented 7 months ago

Hi,

Would you happen to have the BP-related logs? A manual logset would be helpful in determining the cause.

Thanks

michaelmehl commented 7 months ago

I have them attached to Support Portal case #317562. I was referred here to raise an issue. Do you have access to that area?

blitt commented 7 months ago

Checking. How many objects were in the bucket? Thx

michaelmehl commented 7 months ago

In the bucket, closer to a million. The folder failed mid-PUT, which left it with fewer than 277,950 sub-objects.

blitt commented 7 months ago

Hi, I did find the associated BP logs. I will assume the folder delete was for cleanup of the 277,950 objects. Deletes are not fast. Additionally, this system only has 64 GB of system memory. I am a bit surprised at the length of time, but this is also an older version of the BP image and Postgres. This is not an Eon Browser bug.

What is happening is that the deletes are taking a very long time and reach the 2-hour timeout in the SDK. The deletes continue, but the BP returns a 503 because of the timeout, since this is a synchronous request. We can see in the DataPlanner rpc logs that they are still going. The deletes will succeed eventually; we have never taken the time to redesign the request to make it async with a status update. Deletes of this magnitude are rare in a production system.

var.log.tomcat.server-rpc.log:INFO Sep 08 07:07:47,267 [WorkLogger] | Still in progress after 12 hours: [http-apr-0:0:0:0:0:0:0:0-8080-exec-86] RPC TargetManager.deleteObjects<1053623>  (MonitoredWorkManager$WorkLogger.run:84)
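
In the meantime, a client-side workaround is to treat the read timeout as "still in progress" rather than fatal and poll until the folder is gone. A rough sketch against the Java SDK: the deleteFolderRecursivelySpectraS3 call matches the stack trace above, but the request class/constructor and the folderStillExists helper are assumptions that would need to be checked against the shipped SDK:

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

import com.spectralogic.ds3client.Ds3Client;
import com.spectralogic.ds3client.commands.spectrads3.DeleteFolderRecursivelySpectraS3Request;

public final class FolderDeleteWithPoll {

    // Issue the synchronous folder delete; if the SDK's read timeout fires,
    // assume the BP is still working and fall back to polling.
    public static void deleteFolder(final Ds3Client client,
                                    final String bucket,
                                    final String folder) throws Exception {
        try {
            // Request class name/constructor assumed from SDK conventions.
            client.deleteFolderRecursivelySpectraS3(
                    new DeleteFolderRecursivelySpectraS3Request(bucket, folder));
            return; // completed within the timeout window
        } catch (final IOException e) {
            if (!(e instanceof SocketTimeoutException)) {
                throw e; // a real failure, not the long-running-delete case
            }
        }
        // The delete is still running on the BP; poll until the folder is gone.
        while (folderStillExists(client, bucket, folder)) {
            Thread.sleep(60_000); // check once a minute
        }
    }

    // Hypothetical helper: e.g. list the bucket with the folder as a prefix
    // and report whether any objects remain.
    private static boolean folderStillExists(final Ds3Client client,
                                             final String bucket,
                                             final String folder) {
        throw new UnsupportedOperationException("site-specific existence check");
    }
}
```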

michaelmehl commented 7 months ago

Thank you for taking the time to look at this. It could be I need to rethink how I'm utilizing the system, if this is rare for you.

Typically, I programmatically archive/containerize large-count data sets (>100,000 objects), which we see regularly, with 7zip to avoid the PUT object limitation. This go-around consisted of video objects, a case where I avoid that workflow when possible.

blitt commented 7 months ago

No worries. The limitation is 500K objects per job (no change) and up to 10K jobs (was 1K jobs) with the BP 5.2.x or later release. If you are doing PUT jobs using zip files, then I am not sure I follow how we would be deleting individual objects within a given zip file. Are you saying that you have over 100K zip files in a folder under the bucket? Perhaps we should discuss this over a standard email chain or a short phone call. Can you provide your email? Thx
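
For reference, staying under the per-job object limit without zipping is mostly a matter of chunking the object listing before starting each bulk job. A rough sketch using the SDK's bulk helpers; class and method names follow the SDK's documented conventions but should be verified against your version:

```java
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.List;

import com.google.common.collect.Lists;
import com.spectralogic.ds3client.Ds3Client;
import com.spectralogic.ds3client.helpers.Ds3ClientHelpers;
import com.spectralogic.ds3client.models.bulk.Ds3Object;

public final class ChunkedPut {
    // BP limit discussed above: 500K objects per job.
    private static final int MAX_OBJECTS_PER_JOB = 500_000;

    public static void putAll(final Ds3Client client,
                              final String bucket,
                              final List<Ds3Object> objects) throws Exception {
        final Ds3ClientHelpers helpers = Ds3ClientHelpers.wrap(client);
        // One bulk PUT job per chunk keeps every job under the object limit.
        for (final List<Ds3Object> chunk : Lists.partition(objects, MAX_OBJECTS_PER_JOB)) {
            final Ds3ClientHelpers.Job job = helpers.startWriteJob(bucket, chunk);
            // Supply a readable channel for each object name as the job asks for it.
            job.transfer(name -> FileChannel.open(Paths.get(name), StandardOpenOption.READ));
        }
    }
}
```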

michaelmehl commented 7 months ago

In this instance, they were not zipped. I've added my email to the ticket this issue spawned from.