Closed IgnacioHeredia closed 4 years ago
Hi @IgnacioHeredia, I tried this DEEPaaS branch and got this error with MODS running TF2.0.1:
2020-03-06 14:12:05.545 154 INFO deepaas.api [-] Serving loaded V2 models: ['mods']
2020-03-06 14:12:05.546 154 CRITICAL deepaas [-] Unhandled error: AttributeError: 'CancellablePool' object has no attribute 'submit'
2020-03-06 14:12:05.546 154 ERROR deepaas Traceback (most recent call last):
2020-03-06 14:12:05.546 154 ERROR deepaas File "/usr/local/bin/deepaas-run", line 8, in <module>
2020-03-06 14:12:05.546 154 ERROR deepaas sys.exit(main())
2020-03-06 14:12:05.546 154 ERROR deepaas File "/usr/local/lib/python3.6/dist-packages/deepaas/cmd/run.py", line 118, in main
2020-03-06 14:12:05.546 154 ERROR deepaas port=CONF.listen_port,
2020-03-06 14:12:05.546 154 ERROR deepaas File "/usr/local/lib/python3.6/dist-packages/aiohttp/web.py", line 433, in run_app
2020-03-06 14:12:05.546 154 ERROR deepaas reuse_port=reuse_port))
2020-03-06 14:12:05.546 154 ERROR deepaas File "/usr/lib/python3.6/asyncio/base_events.py", line 484, in run_until_complete
2020-03-06 14:12:05.546 154 ERROR deepaas return future.result()
2020-03-06 14:12:05.546 154 ERROR deepaas File "/usr/local/lib/python3.6/dist-packages/aiohttp/web.py", line 296, in _run_app
2020-03-06 14:12:05.546 154 ERROR deepaas app = await app # type: ignore
2020-03-06 14:12:05.546 154 ERROR deepaas File "/usr/local/lib/python3.6/dist-packages/deepaas/api/__init__.py", line 101, in get_app
2020-03-06 14:12:05.546 154 ERROR deepaas await m.warm()
2020-03-06 14:12:05.546 154 ERROR deepaas File "/usr/local/lib/python3.6/dist-packages/deepaas/model/v2/wrapper.py", line 233, in warm
2020-03-06 14:12:05.546 154 ERROR deepaas fs = [run(executor, func) for i in range(0, n)]
2020-03-06 14:12:05.546 154 ERROR deepaas File "/usr/local/lib/python3.6/dist-packages/deepaas/model/v2/wrapper.py", line 233, in <listcomp>
2020-03-06 14:12:05.546 154 ERROR deepaas fs = [run(executor, func) for i in range(0, n)]
2020-03-06 14:12:05.546 154 ERROR deepaas File "/usr/lib/python3.6/asyncio/base_events.py", line 655, in run_in_executor
2020-03-06 14:12:05.546 154 ERROR deepaas return futures.wrap_future(executor.submit(func, *args), loop=self)
2020-03-06 14:12:05.546 154 ERROR deepaas AttributeError: 'CancellablePool' object has no attribute 'submit'
2020-03-06 14:12:05.546 154 ERROR deepaas
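For context, the last frames of the traceback show asyncio's `run_in_executor` calling `executor.submit(func, *args)`, so whatever object is passed as the executor has to provide a `submit` method that returns a `concurrent.futures.Future`. Below is a minimal sketch of that failure mode and one possible way around it. `PoolWithoutSubmit`, `InlineExecutor`, `warm`, and `try_warm` are hypothetical names made up for illustration; this is not the actual DEEPaaS code or the fix that was applied.

```python
import asyncio
from concurrent.futures import Executor, Future


class PoolWithoutSubmit:
    """Hypothetical pool that does not implement the Executor interface."""

    def apply(self, func, *args):
        return func(*args)


class InlineExecutor(Executor):
    """Hypothetical minimal executor: submit() runs the call inline."""

    def submit(self, fn, *args, **kwargs):
        future = Future()
        try:
            future.set_result(fn(*args, **kwargs))
        except Exception as exc:  # propagate the failure through the future
            future.set_exception(exc)
        return future


def warm():
    """Stand-in for a model warm-up function."""
    return "warmed"


async def run_warm(executor):
    loop = asyncio.get_running_loop()
    # run_in_executor calls executor.submit(warm) internally.
    return await loop.run_in_executor(executor, warm)


def try_warm(executor):
    """Run the warm-up and report an AttributeError instead of crashing."""
    try:
        return asyncio.run(run_warm(executor))
    except AttributeError as exc:
        return f"AttributeError: {exc}"


if __name__ == "__main__":
    print(try_warm(PoolWithoutSubmit()))
    print(try_warm(InlineExecutor()))
```

The first call fails in the same way as the log above because the object has no `submit` attribute, while the second succeeds because `submit` exists and returns a future.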
Hi @Stifo, it looks like something related to the warm method (which I didn't test). I'll fix this on Monday. Thanks :)
Hi @Stifo, it should be fixed now. Can you confirm that it is working and that you can, for example, do a predict and then a train?
Hello @IgnacioHeredia, I apologize for the late response. I've just tested the c272428ae5db08170c539b6fc77af0b9c4f7bfe1 commit and it worked. I was able to train a model using the GPU and then make several predictions with the newly trained model.
Thanks @Stifo, that's great!
This fixes the GPU out-of-memory problems that happened when we had two different pools (one for predict and one for train). When we ran train and then predict sequentially (or vice versa), each pool tried to claim the whole GPU, so out-of-memory errors occurred. This won't fix out-of-memory errors when running parallel tasks on the GPU (those errors also happened before this change).
CPU deployments shouldn't be affected.
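As a rough illustration of the single-pool idea, the sketch below routes both train and predict through one shared single-worker pool, so sequential tasks land on the same worker instead of on two competing pools. `SharedPoolModel` and its methods are hypothetical, and a thread pool is used only to keep the example small and runnable; the actual change relies on the DEEPaaS pool implementation, whose internals are not shown in this thread.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SharedPoolModel:
    """Hypothetical wrapper: train and predict share one worker pool."""

    def __init__(self):
        # A single pool with a single worker: tasks are serialized and all
        # run on the same worker, which stands in for "one owner of the GPU".
        self._pool = ThreadPoolExecutor(max_workers=1)

    def _task(self, name):
        # Placeholder for real work; returns the task name and worker id.
        return name, threading.get_ident()

    def train(self):
        return self._pool.submit(self._task, "train")

    def predict(self):
        return self._pool.submit(self._task, "predict")

    def close(self):
        self._pool.shutdown()


if __name__ == "__main__":
    model = SharedPoolModel()
    train_name, train_worker = model.train().result()
    predict_name, predict_worker = model.predict().result()
    model.close()
    print(train_name, predict_name, train_worker == predict_worker)
```

With this layout, a train followed by a predict never has two pools holding resources at once. It does not address parallel tasks, which matches the limitation described above.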
This has been tested with the image classification package on tf 1.14 and GPU (GeForce GTX 1080). Summary of results:
Additional tests on CPU: