CoCoPIE-Group / XGen-Report

The repository for reporting issues about CoCoPIE XGen.

Fail to start the mongo and redis docker #37

Open hsung2 opened 1 year ago

hsung2 commented 1 year ago

Dear authors,

I am using XGen v1.3.0. After installation, I get this error when I run the command "xgen_run":

Starting XGen environment...

Traceback (most recent call last):
  File "/usr/local/bin/init_aio_db", line 8, in <module>
    sys.exit(init_aio_db())
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/xgen_controller_dashboard/init_db.py", line 25, in init_aio_db
    asyncio.run(do_init())
  File "/usr/local/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.10/site-packages/xgen_controller_dashboard/init_db.py", line 17, in do_init
    await initiate_database()
  File "/usr/local/lib/python3.10/site-packages/xgen_controller_dashboard/app/database.py", line 25, in initiate_database
    await init_beanie(
  File "/usr/local/lib/python3.10/site-packages/beanie/odm/utils/init.py", line 530, in init_beanie
    await Initializer(
  File "/usr/local/lib/python3.10/site-packages/beanie/odm/utils/init.py", line 89, in __await__
    yield from self.init_class(model).__await__()
  File "/usr/local/lib/python3.10/site-packages/beanie/odm/utils/init.py", line 500, in init_class
    await self.init_document(cls)
  File "/usr/local/lib/python3.10/site-packages/beanie/odm/utils/init.py", line 324, in init_document
    build_info = await self.database.command({"buildInfo": 1})
  File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.10/site-packages/pymongo/_csot.py", line 105, in csot_wrapper
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/pymongo/database.py", line 805, in command
    with self.__client._socket_for_reads(read_preference, session) as (
  File "/usr/local/lib/python3.10/site-packages/pymongo/mongo_client.py", line 1296, in _socket_for_reads
    server = self._select_server(read_preference, session)
  File "/usr/local/lib/python3.10/site-packages/pymongo/mongo_client.py", line 1257, in _select_server
    server = topology.select_server(server_selector)
  File "/usr/local/lib/python3.10/site-packages/pymongo/topology.py", line 272, in select_server
    server = self._select_server(selector, server_selection_timeout, address)
  File "/usr/local/lib/python3.10/site-packages/pymongo/topology.py", line 261, in _select_server
    servers = self.select_servers(selector, server_selection_timeout, address)
  File "/usr/local/lib/python3.10/site-packages/pymongo/topology.py", line 223, in select_servers
    server_descriptions = self._select_servers_loop(selector, server_timeout, address)
  File "/usr/local/lib/python3.10/site-packages/pymongo/topology.py", line 238, in _select_servers_loop
    raise ServerSelectionTimeoutError(
pymongo.errors.ServerSelectionTimeoutError: xgen_mongodb:27017: [Errno -3] Temporary failure in name resolution, Timeout: 30s, Topology Description: <TopologyDescription id: 653421f3231a643eb83f8ec8, topology_type: Unknown, servers: [<ServerDescription ('xgen_mongodb', 27017) server_type: Unknown, rtt: None, error=AutoReconnect('xgen_mongodb:27017: [Errno -3] Temporary failure in name resolution')>]>
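The `[Errno -3] Temporary failure in name resolution` at the end of this traceback means the hostname `xgen_mongodb` could not be resolved, so the driver never reached the point of opening a connection. That would be consistent with the `xgen_mongodb` container not being up, though the thread does not confirm the cause. A minimal sketch for checking resolution from inside the XGen container, using only the Python standard library; `can_resolve` is a hypothetical helper name, not part of XGen:

```python
import socket


def can_resolve(host: str) -> bool:
    """Return True if the local resolver can map `host` to an address."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        # Raised for failures such as "Temporary failure in name resolution".
        return False


# Usage (run where init_aio_db runs):
#   print(can_resolve("xgen_mongodb"))
```

If this returns False for `xgen_mongodb`, the problem lies with the container or its network rather than with MongoDB credentials or the database itself.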

I can still run xgen_run a second time, but then I get a connection error with Redis:

Starting XGen environment...

Successfully entered the XGen environment. Next, you can type command "XGen" to interact with the powerful toolchain.
(xgen) root@hsung2-MS-7A45:~# XGen
Traceback (most recent call last):
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/redis-4.5.1-py3.7.egg/redis/connection.py", line 716, in connect
    lambda: self._connect(), lambda error: self.disconnect(error)
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/redis-4.5.1-py3.7.egg/redis/retry.py", line 46, in call_with_retry
    return do()
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/redis-4.5.1-py3.7.egg/redis/connection.py", line 716, in <lambda>
    lambda: self._connect(), lambda error: self.disconnect(error)
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/redis-4.5.1-py3.7.egg/redis/connection.py", line 781, in _connect
    raise err
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/redis-4.5.1-py3.7.egg/redis/connection.py", line 769, in _connect
    sock.connect(socket_address)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/miniconda3/envs/xgen/bin/XGen", line 33, in <module>
    sys.exit(load_entry_point('xgen-main==1.2.3', 'console_scripts', 'XGen')())
  File "/usr/local/miniconda3/envs/xgen/bin/XGen", line 25, in importlib_load_entry_point
    return next(matches).load()
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/importlib_metadata/__init__.py", line 209, in load
    module = import_module(match.group('module'))
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_main-1.2.3-py3.7.egg/xgen/main.py", line 5, in <module>
    from xgen.device_lab.cancel_job import cancel_job
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_main-1.2.3-py3.7.egg/xgen/device_lab/__init__.py", line 1, in <module>
    from .helper import get_all_devices
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_main-1.2.3-py3.7.egg/xgen/device_lab/helper.py", line 6, in <module>
    from xgen.utils.redis_ipc import RedisIPC
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_main-1.2.3-py3.7.egg/xgen/utils/redis_ipc.py", line 14, in <module>
    class RedisIPC:
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_main-1.2.3-py3.7.egg/xgen/utils/redis_ipc.py", line 24, in RedisIPC
    r.setnx(Config.device_status_key, json.dumps(None))
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/redis-4.5.1-py3.7.egg/redis/commands/core.py", line 2335, in setnx
    return self.execute_command("SETNX", name, value)
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/redis-4.5.1-py3.7.egg/redis/client.py", line 1255, in execute_command
    conn = self.connection or pool.get_connection(command_name, **options)
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/redis-4.5.1-py3.7.egg/redis/connection.py", line 1481, in get_connection
    connection.connect()
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/redis-4.5.1-py3.7.egg/redis/connection.py", line 721, in connect
    raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 111 connecting to localhost:12379. Connection refused.
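Error 111 (connection refused) means nothing was accepting TCP connections on `localhost:12379` when `XGen` started. A small standard-library sketch to confirm that independently of the Redis client; `port_open` is a hypothetical helper name, and the host and port are taken from the error message above:

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts, and resolution failures.
        return False


# Usage:
#   print(port_open("localhost", 12379))
```

A False result here points at the Redis container or its port mapping rather than at the XGen code that imports `RedisIPC`.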

Finally, I ran xgenctl status, which gave me this information:

root@8b9f28b24409:~# xgenctl status 
Settings check passed.
Redis connection failed.
Redis connection failed. Please ensure the xgen_redis container is running.
MQ connection failed.
Supervisor services check failed: ['queue-monitor', 'packer', 'unpacker', 'device-monitor', 'controllerd']
Supervisor services check failed. Please ensure the configuration file located at`$HOME/.xgen_controller/controller.conf` is correct. Then run `xgenctl restart`.

Here is the status of the mongo and redis containers from docker ps -a:

c334021f1b0a   271679491055.dkr.ecr.us-east-1.amazonaws.com/mongo:6                                               "docker-entrypoint.s…"   20 minutes ago   Restarting (14) 32 seconds ago                                                                                                   xgen_mongodb
0d7bdc7a8cc2   271679491055.dkr.ecr.us-east-1.amazonaws.com/redis:7.0                                             "docker-entrypoint.s…"   20 minutes ago   Restarting (1) 37 seconds ago                                                                                                    xgen_redis
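Both containers show a `Restarting` status, which means they keep exiting and being restarted, so their own logs are the next place to look. A hedged sketch that lists restart-looping containers and fetches their recent logs through the Docker CLI; the helper names are hypothetical, and it assumes `docker` is on the PATH and that the current user may run it:

```python
import subprocess
from typing import List


def restarting_containers(ps_output: str) -> List[str]:
    """Return names whose status starts with 'Restarting'.

    Expects one container per line in the form "<name>\t<status>".
    """
    names = []
    for line in ps_output.splitlines():
        name, sep, status = line.partition("\t")
        if sep and status.strip().startswith("Restarting"):
            names.append(name.strip())
    return names


def docker_ps_status() -> str:
    """Run `docker ps -a` and return "<name>\t<status>" lines."""
    result = subprocess.run(
        ["docker", "ps", "-a", "--format", "{{.Names}}\t{{.Status}}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def tail_logs(name: str, lines: int = 50) -> str:
    """Return the last `lines` log lines (stdout and stderr) of a container."""
    result = subprocess.run(
        ["docker", "logs", "--tail", str(lines), name],
        capture_output=True, text=True,
    )
    return result.stdout + result.stderr


# Usage:
#   for name in restarting_containers(docker_ps_status()):
#       print(name, tail_logs(name), sep="\n")
```

The parsing is kept separate from the subprocess calls so it can be checked against saved `docker ps` output without Docker installed.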

What am I missing for a correct installation?

xinzhang-cocopie commented 1 year ago

It seems there might have been an issue during the installation. Could you please consider reinstalling it, or alternatively, provide your installation script along with its parameters for us to review?

hsung2 commented 1 year ago

Hi Xin,

Thanks for your reply. I fixed this problem by switching to another computer. It may have been caused by an old Docker version. Now I have an expired license issue.

Thanks!

Best, Hsin-Hsuan


xinzhang-cocopie commented 1 year ago

Hi Hsung2,

Indeed, an outdated Docker version can cause installation problems. The installation requirements are outlined in the documentation.
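As an illustration of checking the Docker version before installing, here is a minimal sketch that compares an installed version string against a required minimum. The helper names are hypothetical, and the thread does not state the actual minimum, so any `required` value passed in (such as the `"20.10.0"` in the usage comment) is a placeholder to be replaced with the value from the documentation:

```python
import re
from typing import Tuple


def version_tuple(version: str) -> Tuple[int, ...]:
    """Convert a string such as '20.10.21-ce' to (20, 10, 21)."""
    # Drop any suffix after '-', '+', or '~', then keep the numeric parts.
    core = re.split(r"[-+~]", version.strip(), maxsplit=1)[0]
    parts = [int(p) for p in re.findall(r"\d+", core)][:3]
    parts += [0] * (3 - len(parts))  # pad '20.10' to (20, 10, 0)
    return tuple(parts)


def meets_minimum(installed: str, required: str) -> bool:
    """Return True if `installed` is at least `required`."""
    return version_tuple(installed) >= version_tuple(required)


# Usage ("20.10.0" is a placeholder, not XGen's documented requirement):
#   print(meets_minimum("19.03.8", "20.10.0"))
```

The installed server version can be read with `docker version --format '{{.Server.Version}}'`.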

Please send me the hash code of your new machine so I can generate a new license for you.

To obtain the machine hash, you can delete the /etc/xgen/license file and run XGen. It will output the hash code.

Alternatively, you can check the content of the /etc/xgen/license file on your new machine. I can update the license for you if it's available in our repository.
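For the first option above, a cautious variant is to rename the license file instead of deleting it, so it can be restored if needed. This is a sketch of that swapped-in approach, not the procedure from the reply: it assumes XGen treats a renamed (missing) license the same as a deleted one, `set_aside_license` is a hypothetical helper name, and writing under `/etc/xgen` will normally require root:

```python
from pathlib import Path
from typing import Optional


def set_aside_license(path: str = "/etc/xgen/license") -> Optional[Path]:
    """Rename the license file to '<name>.bak' and return the backup path.

    Returns None if no license file exists at `path`.
    """
    src = Path(path)
    if not src.exists():
        return None
    backup = src.with_name(src.name + ".bak")
    src.replace(backup)  # rename within the same directory; overwrites an old backup
    return backup


# Usage:
#   backup = set_aside_license()
#   # then run XGen and copy the machine hash it prints
```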

Thanks, Xin

From: hsung2 @.> Date: Monday, October 23, 2023 09:25 To: CoCoPIE-Group/XGen-Report @.> Cc: Xin Zhang @.>, Comment @.> Subject: Re: [CoCoPIE-Group/XGen-Report] Fail to start the mongo and redis docker (Issue #37)

Hi Xin,

Thanks for your reply. I fixed this problem by switching to another computer. It may have been caused by an old Docker version. Now I have an expired license issue.

Dr. Shen told me that I could have a virtual meeting with you to solve the problems.

What time are you available tomorrow? Dr. Shen told me you are in China. I am available on your Tuesday morning.

Thanks!

Best, Hsin-Hsuan


hsung2 commented 1 year ago

Hi Xin,

I think Dr. Shen helped me renew the license. I can now run XGen. Thanks for all your help.

Best, Hsin-Hsuan


xinzhang-cocopie commented 1 year ago

Thanks, Xin! I’ve sent him the license.


hsung2 commented 1 year ago

Hi Xin,

I can run XGen now, and I would like to train the built-in YOLOv8 following the instructions in the documentation. However, I keep getting errors after the first training iteration. I attached the training log file to this email. If you know how to fix this, please let me know. Thank you.

Best, Hsin-Hsuan


xinzhang-cocopie commented 1 year ago

Hi hsung2, I haven't seen the attachment you mentioned in your email, and it's not visible on the current issue page either. If possible, could you please open a new issue in the current repository? Thank you.

hsung2 commented 1 year ago

Sure. I uploaded it to a new issue on GitHub.
