Closed ca-scribner closed 2 years ago
Update: this just happened to me when copying a few large files too. It seems to happen intermittently, and it's just easier to trigger with many small files?
Assigned to saffa more for investigation. Might need someone else to support/fix
local -> minio
using mc cp <file> <destination> in R-Studio and Jupyter Notebooks:
minio -> local
trying to just select the file and copy it to minio manually:
R-Studio:
Jupyter Notebooks:
using mc cp <file> <destination> in R-Studio and Jupyter Notebooks:
minio -> minio
trying to just select the file and copy it to minio manually:
R-Studio:
Jupyter Notebooks:
using mc cp <file> <destination> in R-Studio and Jupyter Notebooks:
After an error appears in R-Studio, the whole program slows down all processes (I had to delete the notebook server and create a new one).
From the above testing, it looks like there's no issue with minio itself, but rather with the mounting. For now, the best way to copy files between local and minio is to use the mc cp <file> <destination> command.
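For anyone who wants to time the mc cp route from a notebook rather than a terminal, here is a minimal sketch. It only assumes that the mc client is on the PATH and that `mc cp <file> <destination>` is the command being run; the function names and the injectable `runner` argument are hypothetical conveniences, not part of mc.

```python
import subprocess
import time


def build_mc_cp_command(source, destination):
    """Return the argument list for `mc cp <file> <destination>`."""
    return ["mc", "cp", source, destination]


def timed_mc_cp(source, destination, runner=subprocess.run):
    """Run the copy and return (return_code, elapsed_seconds).

    `runner` defaults to subprocess.run; it is a parameter only so the
    function can be exercised on a machine where mc is not installed.
    """
    command = build_mc_cp_command(source, destination)
    start = time.monotonic()
    result = runner(command, check=False)
    elapsed = time.monotonic() - start
    return result.returncode, elapsed
```

A call would look like `timed_mc_cp("<file-name>", "standard/<bucket-name>")`, with the placeholders replaced by a real file and bucket path.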
USING MINIMAL-TENANT-1
local -> minio
*When you click paste, the rest of the notebook server becomes unresponsive until either it has pasted or an error occurs
JupyterLab-CPU:
100 MB: Notebook becomes unresponsive (can’t type into terminal or create a new notebook). After about 3 minutes, file was copied over to minio and everything went back to normal.
150 MB: After 3 mins and 30 seconds, got an error but everything went back to normal after. File not copied over.
Paste Error
Unexpected error while saving file: minio/minimal-tenant1/private/test150m [Errno 5] Input/output error
1 GB: After 5 minutes, got an error then server was completely unresponsive. File not copied over.
Paste Error Invalid response: 504
R-Studio:
102 MB: Took 2 mins 45 seconds, copied over successfully
150 MB: After 3.5 minutes, get an input/output error. Then server is responsive again. File not copied over.
1 GB: After 5 minutes, get an error and some parts of server not responsive (directory, terminal). File not copied over. Cancelled out of server and re-connected, took a long time to load.
Status code 504 returned
minio -> local
JupyterLab-CPU:
100 MB: Copied over in 5 seconds
150 MB: Copied over in 7 seconds
1 GB: Copied over in 50 seconds
R-Studio
102 MB: Copied over in less than 1 second
150 MB: Copied over in less than 1 second
1 GB: Copied over in 3 seconds
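The size sweep above was done by hand in the file browsers. A rough scripted equivalent is sketched below, assuming the minio bucket is reachable as an ordinary directory through the mount; the function names are hypothetical, and a plain file copy through the mount may not behave exactly like a paste in the JupyterLab or R-Studio UI.

```python
import os
import shutil
import time


def make_test_file(path, size_bytes, chunk_size=1024 * 1024):
    """Write a file of exactly size_bytes filled with random data."""
    remaining = size_bytes
    with open(path, "wb") as handle:
        while remaining > 0:
            block = os.urandom(min(chunk_size, remaining))
            handle.write(block)
            remaining -= len(block)


def timed_copy(source, destination):
    """Copy one file and return (elapsed_seconds, error_or_None)."""
    start = time.monotonic()
    try:
        shutil.copyfile(source, destination)
        error = None
    except OSError as exc:  # for example [Errno 5] Input/output error
        error = exc
    return time.monotonic() - start, error


def run_size_sweep(work_dir, dest_dir, sizes_bytes):
    """Create one file per size in work_dir, copy it to dest_dir, record the result."""
    results = []
    for size in sizes_bytes:
        name = f"test_{size}"
        source = os.path.join(work_dir, name)
        make_test_file(source, size)
        elapsed, error = timed_copy(source, os.path.join(dest_dir, name))
        results.append((size, elapsed, error))
    return results
```

To mirror the tests above, `sizes_bytes` could be 100 MB, 150 MB and 1 GB, with `dest_dir` pointing at the mounted bucket for local -> minio and `work_dir` pointing at it for minio -> local.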
@saffaalvi was this done against the new tenants?
Can the tests also be performed with mc? Copy/pasting in the UI means the browser is now coordinating things, but I need to look into it more ^_^
@sylus The testing still isn't completed, but I noticed the mc cp behaviour was pretty similar to the last time I tested it: it copied over successfully and quickly. This was done with minimal-tenant-1; should I be trying it with standard-tenant-1?
USING STANDARD-TENANT-1
local -> minio
*When you click paste, the rest of the notebook server becomes unresponsive until either it has pasted or an error occurs
JupyterLab-CPU
Paste Error Invalid Response: 504 error. File did not copy over. Notebook was unresponsive after. When I exited out of this server after, it wouldn’t reconnect. Had to create a new notebook server to proceed with tests.
R-Studio:
Status code 504 returned error. File not copied over. Server took a long time to re-connect after this. When I exited out of this server after, it wouldn’t reconnect. Had to create a new notebook server to proceed with tests.
Using mc cp <file-name> standard/<bucket-name>
JupyterLab-CPU:
R-Studio:
minio -> local
JupyterLab-CPU:
R-Studio
Using mc cp standard/<bucket-name>/<file-name> /home/jovyan
JupyterLab-CPU:
R-studio
minio -> minio
JupyterLab-CPU
Paste Error: Unexpected error while saving file: minio/standard-tenant-1/private/test150m-Copy1 [Errno 5] Input/output error after 2 mins and 53 seconds
Paste Error: Invalid response: 504. File not copied over. Server unresponsive after.
R-Studio:
Status code 504 returned. File not copied over.
Using mc cp standard/<bucket-name>/<file-name> standard/<bucket-name>/<file-name>-Copy
JupyterLab-CPU:
R-studio
Summary of standard-tenant-1 results compared to testing with minimal-tenant-1 from Nov. 19, 2020:
local -> minio:
minio -> local:
minio -> minio
Results do seem to be a little inconsistent: I would try the same process a few times and get different results, as recorded above. @ca-scribner noticed this too when trying to use mc cp with the 1GB file.
To clarify, the JupyterLab-CPU/rstudio entries are for copy/pasting in the respective file browser, and mc cp is for the terminal command?
On Fri, Dec 18, 2020 at 15:01 Saffa Alvi notifications@github.com wrote:
Summary of standard-tenant-1 results compared to testing from Nov. 19, 2020: local -> minio:
- JupyterLab-CPU: Handled the 150 MB file better than last time since there was no error, but still took over 2 minutes to copy. Still unable to handle the 1 GB file, server still crashed with error.
- R-Studio: Last time, breaking point was any file larger than 101MB, but this time, was able to copy over a 150 MB file. Still unable to handle the 1 GB file, server still crashed with error.
- Using mc cp: Both notebook servers behaved like last time, but were maybe a little faster.
minio -> local:
- JupyterLab-CPU: Handles the 1 GB file better than last time, but still took long to copy over (48 seconds)
- R-Studio: Same results as last time
- Using mc cp: For some reason, copying over the 1 GB file now took WAY longer when last time, it only took 5-8 seconds. The file hadn’t copied over in both notebook servers even after 20 minutes.
minio -> minio
- JupyterLab-CPU: Got an error for the 1 GB file much later than last time, but it was still unable to copy over.
- R-Studio: Same response as last time.
- Using mc cp: Quicker than last time.
Results do seem to be a little inconsistent, I would try the same process a few times and get different results as recorded above. @ca-scribner noticed this too when trying to use mc cp with the 1GB file.
@ca-scribner yes, mc cp also has JupyterLab-CPU/R-Studio entries below it to show which notebook server the terminal command was run in and the results.
Thanks @saffaalvi, the information you provided is super helpful!
Work proceeding over at https://github.com/StatCan/daaas/issues/348
When doing speed tests on copying with minio vs attached disk, I found an Input/output error during the copy action when working with hundreds of files.
The speed test copied n files in the following ways:
local -> minio
minio -> local
minio -> minio
Found that minio -> local or minio -> minio work well with small numbers of files (n<~100) but break most times when n>200, giving an Input/output error from the copy action (e.g. the source file for the copy action is not available).
After the process fails, I can see:
mc cp minio_tenant/...file)
cp minio_file_that_failed some_destination
it will consistently return the Input/output error for some time (seconds to minutes), then eventually the cp will work and continue to work indefinitely.
To reproduce, you can use the code here
cases list of dict to use size=1k, n=1000
minio -> minio copy step for failure
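The code linked above did not survive in this copy of the thread, so the following is only a rough stand-in for it, not the original script. It assumes the minio bucket is mounted as a regular directory; `make_small_files`, `copy_all` and `count_io_errors` are hypothetical names. It creates n small files, copies them one by one, and counts how many copies fail with [Errno 5] Input/output error.

```python
import errno
import os
import shutil


def make_small_files(directory, n, size_bytes=1024):
    """Create n files of size_bytes each and return their paths."""
    os.makedirs(directory, exist_ok=True)
    paths = []
    for index in range(n):
        path = os.path.join(directory, f"file_{index:05d}")
        with open(path, "wb") as handle:
            handle.write(os.urandom(size_bytes))
        paths.append(path)
    return paths


def copy_all(paths, destination_dir):
    """Copy every file; return (path, errno) pairs for the copies that failed."""
    os.makedirs(destination_dir, exist_ok=True)
    failures = []
    for path in paths:
        target = os.path.join(destination_dir, os.path.basename(path))
        try:
            shutil.copyfile(path, target)
        except OSError as exc:
            failures.append((path, exc.errno))
    return failures


def count_io_errors(failures):
    """Count the failures that are [Errno 5] Input/output error."""
    return sum(1 for _, code in failures if code == errno.EIO)
```

For the minio -> minio case described above, both directories would sit under the mounted bucket, with `n=1000` and `size_bytes=1024` matching the size=1k, n=1000 setting.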