neuro-inc / neuro-cli

Platform-specific API and CLI python client
https://neu-ro.gitbook.io/neu-ro-cli-reference/

Issues with copying a large tree #1093

Closed serhiy-storchaka closed 4 years ago

serhiy-storchaka commented 4 years ago

I have encountered the following issues when copying a large tree:

1.

$ time neuro --show-traceback storage cp -r storage:fastai /tmp
Copy storage://serhiystorchaka/fastai => file:///tmp/fastai
storage://serhiystorchaka/fastai/data/oxford-iiit-pet/annotations/xmls DONE
basset_hound_41.png [0.00%] 0B of 2.842K
._american_bulldog_152.png [0.00%] 0B of 240B
._British_Shorthair_52.png [0.00%] 0B of 240B
._Maine_Coon_71.png [0.00%] 0B of 240B
._english_cocker_spaniel_44.png [0.00%] 0B of 240B
._Russian_Blue_226.png [0.00%] 0B of 240B
._staffordshire_bull_terrier_110.png [0.00%] 0B of 240B
._Sphynx_29.png [0.00%] 0B of 240B
wheaten_terrier_9.png [0.00%] 0B of 2.305K
ERROR: Connection error (Response payload is not completed)
Traceback (most recent call last):
  File "/home/serhiy/neuromation/platform-client-python/neuromation/cli/main.py", line 288, in main
    cli.main(args=args, standalone_mode=False)
  File "/home/serhiy/neuromation/platform-client-python/venv/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/serhiy/neuromation/platform-client-python/venv/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/serhiy/neuromation/platform-client-python/venv/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/serhiy/neuromation/platform-client-python/venv/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/serhiy/neuromation/platform-client-python/venv/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/serhiy/neuromation/platform-client-python/venv/lib/python3.6/site-packages/click/decorators.py", line 27, in new_func
    return f(get_current_context().obj, *args, **kwargs)
  File "/home/serhiy/neuromation/platform-client-python/neuromation/cli/utils.py", line 191, in wrapper
    debug=root.verbosity >= 2,  # see main:setup_logging for constants
  File "/home/serhiy/neuromation/platform-client-python/neuromation/cli/asyncio_utils.py", line 63, in run
    return loop.run_until_complete(main_task)
  File "/usr/lib/python3.6/asyncio/base_events.py", line 484, in run_until_complete
    return future.result()
  File "/home/serhiy/neuromation/platform-client-python/neuromation/cli/utils.py", line 139, in _run_async_function
    return await func(root, *args, **kwargs)
  File "/home/serhiy/neuromation/platform-client-python/neuromation/cli/storage.py", line 335, in cp
    src, dst, update=update, progress=progress_obj
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 563, in download_dir
    src, dst, path, update=update, progress=progress, queue=queue
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 695, in _run_progress
    await task
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 683, in wrapped
    await coro
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 631, in _download_dir
    await _run_concurrently(tasks)
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 668, in _run_concurrently
    await task
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 631, in _download_dir
    await _run_concurrently(tasks)
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 668, in _run_concurrently
    await task
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 631, in _download_dir
    await _run_concurrently(tasks)
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 668, in _run_concurrently
    await task
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 631, in _download_dir
    await _run_concurrently(tasks)
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 668, in _run_concurrently
    await task
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 536, in _download_file
    async for chunk in self.open(src):
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 254, in open
    async for data in resp.content.iter_any():
  File "/home/serhiy/neuromation/platform-client-python/venv/lib/python3.6/site-packages/aiohttp/streams.py", line 39, in __anext__
    rv = await self.read_func()
  File "/home/serhiy/neuromation/platform-client-python/venv/lib/python3.6/site-packages/aiohttp/streams.py", line 380, in readany
    await self._wait('readany')
  File "/home/serhiy/neuromation/platform-client-python/venv/lib/python3.6/site-packages/aiohttp/streams.py", line 296, in _wait
    await waiter
aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed

real    17m54.088s
user    1m34.285s
sys     0m15.152s

2.

$ time neuro --show-traceback storage cp -r storage:fastai /tmp
Copy storage://serhiystorchaka/fastai => file:///tmp/fastai
storage://serhiystorchaka/fastai/data/oxford-iiit-pet/images DONE
British_Shorthair_185.png 2.006K
._newfoundland_169.png 240B
._yorkshire_terrier_97.png [0.00%] 0B of 240B
newfoundland_137.png [0.00%] 0B of 2.65K
staffordshire_bull_terrier_30.png 2.464K
._pug_181.png [0.00%] 0B of 240B
samoyed_5.png 1.72K
._Sphynx_78.png [0.00%] 0B of 240B
Abyssinian_217.png [0.00%] 0B of 1.588K
ERROR: cannot copy storage://serhiystorchaka/fastai to file:///tmp/fastai: {"error": "Unexpected exception TimeoutError: . Path with query: /api/v1/storage/serhiystorchaka/fastai/data/oxford-iiit-pet/annotations/trimaps/._english_cocker_spaniel_70.png?op=OPEN."}

real    56m50.596s
user    3m48.978s
sys     0m36.420s

3.

$ time neuro --show-traceback storage cp -r storage:fastai /tmp
Copy storage://serhiystorchaka/fastai => file:///tmp/fastai
storage://serhiystorchaka/fastai/data/oxford-iiit-pet/annotations/xmls DONE
american_pit_bull_terrier_37.png 3.146K
._havanese_139.png 240B
._Sphynx_66.png 240B
keeshond_94.png 3.878K
great_pyrenees_151.png 2.599K
._japanese_chin_109.png 240B
boxer_183.png 2.427K
storage://serhiystorchaka/fastai/data/oxford-iiit-pet/annotations/trimaps DONE
storage://serhiystorchaka/fastai/data/oxford-iiit-pet/annotations DONE

(and hung).

The first two could be solved by catching the error and retrying the request, but the latter likely indicates a deadlock.
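
A minimal sketch of the catch-and-retry idea, not the fix that was actually applied to the client. The helper `retry`, its parameters, and the `FlakyDownload` stand-in are hypothetical names used only for illustration. In the real client the retried exceptions would presumably be the ones from the tracebacks (`aiohttp.ClientPayloadError`, `aiohttp.ServerDisconnectedError`); the built-in `ConnectionError` is used here so the example runs without aiohttp.

```python
import asyncio
import random


async def retry(make_call, *, retry_on, attempts=3, base_delay=0.5):
    """Await make_call() up to `attempts` times, backing off between failures."""
    for attempt in range(1, attempts + 1):
        try:
            return await make_call()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            # Exponential backoff with jitter, so that many parallel
            # downloads do not all retry at the same moment.
            delay = base_delay * 2 ** (attempt - 1)
            await asyncio.sleep(delay * random.uniform(0.5, 1.0))


class FlakyDownload:
    """Hypothetical stand-in for one file download.

    Fails `failures` times with a connection error, then succeeds.
    """

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    async def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionError("Response payload is not completed")
        return b"file contents"


def demo():
    download = FlakyDownload(failures=2)
    data = asyncio.run(
        retry(download, retry_on=(ConnectionError,), base_delay=0.01)
    )
    return data, download.calls


if __name__ == "__main__":
    print(demo())
```

Each retry has to restart the file from the beginning (or resume from a known offset), because a failure in the middle of a response leaves a partially written local file. Retrying would not help with case 3, where no error is raised at all.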

serhiy-storchaka commented 4 years ago

4.

$ time neuro --show-traceback storage cp -r storage:fastai /tmp
Copy storage://serhiystorchaka/fastai => file:///tmp/fastai
storage://serhiystorchaka/fastai/data/oxford-iiit-pet/images ...
english_cocker_spaniel_119.jpg [0.00%] 0B of 122.2K
american_bulldog_167.jpg [0.00%] 0B of 82.48K
scottish_terrier_111.jpg [0.00%] 0B of 132.9K
german_shorthaired_178.jpg [0.00%] 0B of 145.6K
pug_12.jpg [0.00%] 0B of 88.36K
american_bulldog_27.jpg [0.00%] 0B of 155K
american_pit_bull_terrier_76.jpg [0.00%] 0B of 180.5K
Maine_Coon_107.jpg [0.00%] 0B of 74.53K
japanese_chin_41.jpg [0.00%] 0B of 136.4K
ERROR: Connection error (None)
Traceback (most recent call last):
  File "/home/serhiy/neuromation/platform-client-python/neuromation/cli/main.py", line 288, in main
    cli.main(args=args, standalone_mode=False)
  File "/home/serhiy/neuromation/platform-client-python/venv/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/serhiy/neuromation/platform-client-python/venv/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/serhiy/neuromation/platform-client-python/venv/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/serhiy/neuromation/platform-client-python/venv/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/serhiy/neuromation/platform-client-python/venv/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/serhiy/neuromation/platform-client-python/venv/lib/python3.6/site-packages/click/decorators.py", line 27, in new_func
    return f(get_current_context().obj, *args, **kwargs)
  File "/home/serhiy/neuromation/platform-client-python/neuromation/cli/utils.py", line 191, in wrapper
    debug=root.verbosity >= 2,  # see main:setup_logging for constants
  File "/home/serhiy/neuromation/platform-client-python/neuromation/cli/asyncio_utils.py", line 63, in run
    return loop.run_until_complete(main_task)
  File "/usr/lib/python3.6/asyncio/base_events.py", line 484, in run_until_complete
    return future.result()
  File "/home/serhiy/neuromation/platform-client-python/neuromation/cli/utils.py", line 139, in _run_async_function
    return await func(root, *args, **kwargs)
  File "/home/serhiy/neuromation/platform-client-python/neuromation/cli/storage.py", line 335, in cp
    src, dst, update=update, progress=progress_obj
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 563, in download_dir
    src, dst, path, update=update, progress=progress, queue=queue
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 695, in _run_progress
    await task
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 683, in wrapped
    await coro
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 631, in _download_dir
    await _run_concurrently(tasks)
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 668, in _run_concurrently
    await task
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 631, in _download_dir
    await _run_concurrently(tasks)
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 668, in _run_concurrently
    await task
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 631, in _download_dir
    await _run_concurrently(tasks)
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 668, in _run_concurrently
    await task
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 631, in _download_dir
    await _run_concurrently(tasks)
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 668, in _run_concurrently
    await task
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 631, in _download_dir
    await _run_concurrently(tasks)
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 668, in _run_concurrently
    await task
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 536, in _download_file
    async for chunk in self.open(src):
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/storage.py", line 253, in open
    async with self._core.request("GET", url, timeout=timeout) as resp:
  File "/home/serhiy/neuromation/platform-client-python/venv/lib/python3.6/site-packages/async_generator/_util.py", line 34, in __aenter__
    return await self._agen.asend(None)
  File "/home/serhiy/neuromation/platform-client-python/neuromation/api/core.py", line 119, in request
    timeout=timeout,
  File "/home/serhiy/neuromation/platform-client-python/venv/lib/python3.6/site-packages/aiohttp/client.py", line 1012, in __aenter__
    self._resp = await self._coro
  File "/home/serhiy/neuromation/platform-client-python/venv/lib/python3.6/site-packages/aiohttp/client.py", line 504, in _request
    await resp.start(conn)
  File "/home/serhiy/neuromation/platform-client-python/venv/lib/python3.6/site-packages/aiohttp/client_reqrep.py", line 847, in start
    message, payload = await self._protocol.read()  # type: ignore  # noqa
  File "/home/serhiy/neuromation/platform-client-python/venv/lib/python3.6/site-packages/aiohttp/streams.py", line 591, in read
    await self._waiter
aiohttp.client_exceptions.ServerDisconnectedError: None

real    6m8.857s
user    0m11.599s
sys     0m1.468s

serhiy-storchaka commented 4 years ago

#1107 was intended to fix cases like 2 and 4. It could also fix case 1, but it is unlikely to be related to case 3.

Meanwhile, I can no longer reproduce any of these issues. I ran uploading a large tree 100 times, and experimented with larger trees and different client configurations, and every run completed successfully. Maybe #1107 had a larger effect than I expected, or the behavior was affected by changes on the server side. In any case, I am closing this issue. If we encounter any further problems, we will open a new issue.