Closed hajes closed 1 month ago
Impossible to run chia start farmer -r anymore
2024-07-21T21:36:18.257 daemon chia.daemon.server : INFO Connection close requested. Closing websocket with ['Unknown'].
2024-07-21T21:37:09.588 daemon chia.daemon.server : INFO Daemon Server stopping, Services stopped: []
2024-07-21T21:37:09.588 daemon chia.daemon.server : INFO Connection closed. Closing websocket with ['Unknown'].
2024-07-21T21:37:09.589 daemon chia.daemon.server : INFO chia daemon exiting
2024-07-21T21:37:09.589 daemon chia.daemon.server : INFO Daemon WebSocketServer closed
2024-07-21T21:37:12.109 daemon chia.daemon.server : INFO chia-blockchain version: 2.4.2
2024-07-21T21:37:12.136 daemon chia.daemon.server : INFO Starting Daemon Server (localhost:55400)
2024-07-21T21:37:14.162 daemon chia.daemon.server : ERROR problem starting chia_harvester
Traceback (most recent call last):
  File "/home/hajes/chia-blockchain/chia/daemon/server.py", line 1286, in start_service
    process, pid_path = launch_service(self.root_path, exe_command)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hajes/chia-blockchain/chia/daemon/server.py", line 1468, in launch_service
    process = subprocess.Popen(
              ^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/subprocess.py", line 1024, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/lib/python3.11/subprocess.py", line 1901, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'chia_harvester'
2024-07-21T21:37:14.164 daemon chia.daemon.server : ERROR problem starting chia_farmer
Traceback (most recent call last):
  File "/home/hajes/chia-blockchain/chia/daemon/server.py", line 1286, in start_service
    process, pid_path = launch_service(self.root_path, exe_command)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hajes/chia-blockchain/chia/daemon/server.py", line 1468, in launch_service
    process = subprocess.Popen(
              ^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/subprocess.py", line 1024, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/lib/python3.11/subprocess.py", line 1901, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'chia_farmer'
2024-07-21T21:37:14.166 daemon chia.daemon.server : ERROR problem starting chia_full_node
Traceback (most recent call last):
  File "/home/hajes/chia-blockchain/chia/daemon/server.py", line 1286, in start_service
    process, pid_path = launch_service(self.root_path, exe_command)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hajes/chia-blockchain/chia/daemon/server.py", line 1468, in launch_service
    process = subprocess.Popen(
              ^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/subprocess.py", line 1024, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/lib/python3.11/subprocess.py", line 1901, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'chia_full_node'
2024-07-21T21:37:14.168 daemon chia.daemon.server : ERROR problem starting chia_wallet
Traceback (most recent call last):
  File "/home/hajes/chia-blockchain/chia/daemon/server.py", line 1286, in start_service
    process, pid_path = launch_service(self.root_path, exe_command)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hajes/chia-blockchain/chia/daemon/server.py", line 1468, in launch_service
    process = subprocess.Popen(
              ^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/subprocess.py", line 1024, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/lib/python3.11/subprocess.py", line 1901, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'chia_wallet'
2024-07-21T21:37:14.169 daemon chia.daemon.server : INFO Connection close requested. Closing websocket with ['Unknown'].
It looks like the latest version doesn't like being started as chia-blockchain/venv/bin/chia start farmer -r. After activating the Python venv, it works again.
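The traceback above shows the daemon calling subprocess.Popen with bare names such as 'chia_harvester', so whether the service starts depends on the PATH the daemon inherited. Below is a minimal sketch of that lookup, assuming the service executables sit next to chia in venv/bin. The helper names and the demo file name are hypothetical and not part of chia.

```python
import os
import shutil
import stat
import tempfile


def resolve(name, path):
    """Return the full path a bare executable name resolves to, or None."""
    return shutil.which(name, path=path)


def path_with_venv(venv_bin, path):
    """Prepend a venv bin directory, roughly what activating a venv does to PATH."""
    return venv_bin + os.pathsep + path


# Demo with a throwaway directory standing in for venv/bin (assumption).
with tempfile.TemporaryDirectory() as fake_venv_bin:
    exe = os.path.join(fake_venv_bin, "chia_harvester_demo")
    with open(exe, "w") as f:
        f.write("#!/bin/sh\nexit 0\n")
    os.chmod(exe, os.stat(exe).st_mode | stat.S_IXUSR)

    bare_path = "/nonexistent-dir"  # a PATH that lacks the venv
    found_without = resolve("chia_harvester_demo", bare_path)
    found_with = resolve(
        "chia_harvester_demo", path_with_venv(fake_venv_bin, bare_path)
    )
```

With the venv directory missing from PATH the name does not resolve, which would match the FileNotFoundError in the log; with it prepended, the lookup succeeds.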
Today I observed a live crash. The watchdog script kicked in and chia stop all -d executed successfully, yet the following chia processes were still running. chia start farmer -r also failed. What the hell is going on?
ps aux | grep chia
hajes 2113688 0.0 2.1 1326160 354124 ? S Jul21 0:17 chia_full_node_block_validation_worker
hajes 2113689 0.0 2.1 1326160 354012 ? S Jul21 0:16 chia_full_node_block_validation_worker
hajes 2113690 0.0 2.1 1326160 354584 ? S Jul21 0:18 chia_full_node_block_validation_worker
hajes 2113691 0.0 2.1 1326160 354072 ? S Jul21 0:17 chia_full_node_block_validation_worker
hajes 2113692 0.0 2.1 1326160 353788 ? S Jul21 0:15 chia_full_node_block_validation_worker
hajes 2113693 0.0 2.1 1326160 354188 ? S Jul21 0:17 chia_full_node_block_validation_worker
hajes 2113694 0.0 2.1 1326160 353844 ? S Jul21 0:14 chia_full_node_block_validation_worker
hajes 2113695 0.0 2.1 1326160 354268 ? S Jul21 0:18 chia_full_node_block_validation_worker
hajes 2113696 0.0 2.1 1326160 354144 ? S Jul21 0:17 chia_full_node_block_validation_worker
hajes 2113697 0.0 2.1 1326160 354044 ? S Jul21 0:17 chia_full_node_block_validation_worker
hajes 2113698 0.0 2.1 1326160 354028 ? S Jul21 0:15 chia_full_node_block_validation_worker
hajes 2113699 0.0 2.1 1326160 354524 ? S Jul21 0:18 chia_full_node_block_validation_worker
hajes 2113700 0.0 2.1 1326160 354412 ? S Jul21 0:15 chia_full_node_block_validation_worker
hajes 2113701 0.0 2.1 1326160 354028 ? S Jul21 0:17 chia_full_node_block_validation_worker
hajes 2113703 0.0 2.1 1326160 354456 ? S Jul21 0:16 chia_full_node_block_validation_worker
hajes 2113704 0.0 2.1 1326160 354884 ? S Jul21 0:18 chia_full_node_block_validation_worker
hajes 2113705 0.0 2.1 1326160 356532 ? S Jul21 0:17 chia_full_node_block_validation_worker
hajes 2113706 0.0 2.1 1326160 356116 ? S Jul21 0:15 chia_full_node_block_validation_worker
hajes 2113707 0.0 2.1 1326160 353800 ? S Jul21 0:16 chia_full_node_block_validation_worker
hajes 2113708 0.0 2.1 1326160 354084 ? S Jul21 0:16 chia_full_node_block_validation_worker
hajes 2113709 0.0 2.1 1326160 354616 ? S Jul21 0:15 chia_full_node_block_validation_worker
hajes 2113710 0.0 2.1 1326160 354120 ? S Jul21 0:17 chia_full_node_block_validation_worker
hajes 2113711 0.0 2.1 1326160 353924 ? S Jul21 0:15 chia_full_node_block_validation_worker
hajes 2113712 0.0 2.1 1326160 353968 ? S Jul21 0:14 chia_full_node_block_validation_worker
hajes 2113713 0.0 2.1 1326160 354136 ? S Jul21 0:17 chia_full_node_block_validation_worker
hajes 2113714 0.0 2.1 1326160 354056 ? S Jul21 0:16 chia_full_node_block_validation_worker
hajes 2113715 0.0 2.1 1326160 354008 ? S Jul21 0:13 chia_full_node_block_validation_worker
hajes 2113716 0.0 2.1 1326160 354284 ? S Jul21 0:17 chia_full_node_block_validation_worker
hajes 2113717 0.0 2.1 1326160 353924 ? S Jul21 0:16 chia_full_node_block_validation_worker
hajes 2113718 0.0 2.1 1326160 354060 ? S Jul21 0:13 chia_full_node_block_validation_worker
hajes 2113719 0.0 2.1 1326160 354416 ? S Jul21 0:18 chia_full_node_block_validation_worker
hajes 2113720 0.0 2.1 1326160 353936 ? S Jul21 0:14 chia_full_node_block_validation_worker
hajes 2114674 0.0 2.2 1399972 370480 ? S Jul21 0:12 chia_full_node_mempool_worker
hajes 2114675 0.0 2.2 1399972 369632 ? S Jul21 0:12 chia_full_node_mempool_worker
hajes 2246173 0.0 0.0 6332 2064 pts/1 S+ 06:41 0:00 grep chia
With the following error:
Traceback (most recent call last):
  File "/home/hajes/chia-blockchain/chia/util/lock.py", line 42, in acquire
    self._lock.acquire(timeout=timeout, poll_interval=poll_interval)
  File "/home/hajes/chia-blockchain/venv/lib/python3.11/site-packages/filelock/_api.py", line 304, in acquire
    raise Timeout(lock_filename)  # noqa: TRY301
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
filelock._error.Timeout: The file lock '/home/hajes/.chia/mainnet/run/full_node.lock' could not be acquired.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/hajes/chia-blockchain/chia/server/start_service.py", line 196, in run
    with Lockfile.create(service_launch_lock_path(self.root_path, self._service_name), timeout=1):
  File "/home/hajes/chia-blockchain/chia/util/lock.py", line 29, in __enter__
    self.acquire(timeout=self.timeout, poll_interval=self.poll_interval)
  File "/home/hajes/chia-blockchain/chia/util/lock.py", line 44, in acquire
    raise LockfileError(e) from e
chia.util.lock.LockfileError: The file lock '/home/hajes/.chia/mainnet/run/full_node.lock' could not be acquired.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/hajes/chia-blockchain/venv/bin/chia_full_node", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/hajes/chia-blockchain/chia/server/start_full_node.py", line 102, in main
    return async_run(coro=async_main(service_config), connection_limit=target_peer_count)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hajes/chia-blockchain/chia/server/start_service.py", line 323, in async_run
    return asyncio.run(coro)
           ^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/home/hajes/chia-blockchain/chia/server/start_full_node.py", line 85, in async_main
    await service.run()
  File "/home/hajes/chia-blockchain/chia/server/start_service.py", line 201, in run
    raise ValueError(f"{self._service_name}: already running") from e
ValueError: full_node: already running
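The "already running" error above is raised when the launch lock for full_node cannot be acquired within the timeout. As a rough illustration of why a surviving holder blocks a fresh start, here is a minimal sketch that swaps in fcntl.flock from the standard library (Unix only) for the filelock package seen in the traceback. The try_lock helper and the demo lock file name are hypothetical.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Try to take an exclusive, non-blocking lock on path.

    Returns the open file object on success (keep it open to hold the lock),
    or None if another open file description already holds the lock.
    """
    f = open(path, "a")
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        f.close()
        return None
    return f


# Throwaway lock file; not the real ~/.chia/mainnet/run/full_node.lock.
lock_path = os.path.join(tempfile.mkdtemp(), "full_node_demo.lock")

first = try_lock(lock_path)   # stands in for the still-running old holder
second = try_lock(lock_path)  # stands in for the new start attempt
first.close()                 # closing the holder releases the lock
third = try_lock(lock_path)   # a start attempt after the holder is gone
```

The second attempt fails while the first holder is open and succeeds once it is closed. Whether the leftover worker processes were what kept the real lock held is not established in this thread.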
Only pkill -9 -f chia worked.
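A watchdog could detect leftovers like the ones listed above before resorting to pkill -9. The following is a minimal sketch that parses ps aux text and returns the PIDs whose command starts with a given prefix. The function name and the "chia_" prefix are assumptions; the sample rows are copied from the output above, with a header row added.

```python
def leftover_pids(ps_output, prefix="chia_"):
    """Return PIDs from `ps aux` text whose command starts with prefix."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 10)  # 11 columns; COMMAND is the last one
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header row or malformed line
        if fields[10].startswith(prefix):
            pids.append(int(fields[1]))
    return pids


sample = """\
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
hajes 2113688 0.0 2.1 1326160 354124 ? S Jul21 0:17 chia_full_node_block_validation_worker
hajes 2114674 0.0 2.2 1399972 370480 ? S Jul21 0:12 chia_full_node_mempool_worker
hajes 2246173 0.0 0.0 6332 2064 pts/1 S+ 06:41 0:00 grep chia
"""

pids = leftover_pids(sample)
```

Matching on the start of the command keeps the grep process itself out of the result. A watchdog could then try os.kill with SIGTERM on each PID and fall back to SIGKILL only if the process survives.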
I came back from work, farmer dead again.
Hey @hajes, it looks like the issues might be related to the db being corrupt:
2024-07-20T03:55:33.777 full_node chia.full_node.full_node: ERROR sync from fork point failed: DatabaseError: database disk image is malformed
I would:
~/.chia/mainnet/config/config.yaml
Once chia is synced you will need to update the config file for any custom settings:
Let us know if you run into any issues or have any questions in the process and keep in mind that we can generally provide more timely and thorough support in our discord server (https://discord.gg/chia)
Thanks for the reply @BrandtH22, and my apologies for the Discord rage a few days ago... one should sleep overnight instead of trying to fix something and then talking blulsiht... :o)
I already did all of the above, as written in the initial post. What you mention is 2 days old: all this crashing corrupted the database, and it refused to sync.
After a fresh install, it is still crashing all the time. I have a log in debug mode. It is almost 300MB compressed, if you are interested.
The last crash was at about 7:30, and then it crashed again somewhere between 7:30 and 16:00 while I was at work; no idea when exactly.
So far I have a kludge: a watchdog that watches the logs and restarts chia or kills ghost processes. So far it seems to work... just 15 hours gone in the name of a new state-of-the-art farmer :-(
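For illustration only, the log-watching half of such a watchdog could look roughly like this. It is a sketch that assumes the line layout seen in the logs quoted in this thread (timestamp, service, logger, colon, level, message); the regular expression and function name are hypothetical, not the author's script.

```python
import re

# Assumed layout: "<timestamp> <service> <logger> : <LEVEL> <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<service>\S+)\s+(?P<logger>\S+)\s*:\s+"
    r"(?P<level>[A-Z]+)\s+(?P<msg>.*)$"
)


def error_entries(lines):
    """Yield (timestamp, service, message) for every ERROR-level log line."""
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group("level") == "ERROR":
            yield m.group("ts"), m.group("service"), m.group("msg")


# Sample lines copied from the logs quoted in this thread.
sample = [
    "2024-07-21T21:37:12.136 daemon chia.daemon.server : INFO Starting Daemon Server (localhost:55400)",
    "2024-07-21T21:37:14.162 daemon chia.daemon.server : ERROR problem starting chia_harvester",
    "Traceback (most recent call last):",
    "2024-07-20T03:55:33.777 full_node chia.full_node.full_node: ERROR sync from fork point failed: DatabaseError: database disk image is malformed",
]

errors = list(error_entries(sample))
```

Traceback continuation lines do not match the pattern and are skipped, so only the two ERROR entries are reported.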
Reinstalled whole system, and it is running. Something got screwed up during/after install.
My celebration of "success" was premature. The freshly installed farmer crashed after 16 hours. The official Chia systemd script didn't restart the failing process.
I noticed chia processes eat lots of memory/swap. 1x14 threads C4 should use <500MB of RAM. System has got 16GB of RAM + 2GB swap. It is filled with buffered crap, memory leaks...no idea.
now what?
Based on other issues, the thinking on our side is that some kind of hardware issue is causing system flakiness. It may be some component other than the RMA'd motherboard that is at fault.
What happened?
Please check the log output, guys: there are so many errors that I have no idea what exactly the issue is.
I have installed a new rig with PCIe 4 to handle modern GPUs.
The node runs the latest Debian Bookworm.
I have tried the Debian package of Chia... it crashes randomly after hours or a day of farming. I removed the Debian version of Chia and installed from Git source, because I have used that on the old server for years without issues.
Everything has been set up fresh, including the initial sync. This time, farming ran for about 10 hours and then crashed again with what seem to be the same errors.
Systemd is supposed to restart a failed process... what is interesting is that systemd claims status RUNNING, all green, no errors... but everything related to chia is dead: chia show -s or chia farm summary give no response.
Any suggestions why, please? The only thing that changed is the Chia version, 2.4.1 > 2.4.2.
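Since systemd reports the unit as running while the CLI gets no answer, a separate health check with a timeout is one way to notice the hang. The following is a minimal sketch; the function name and the timeout values are assumptions, and the demo uses the Python interpreter as a stand-in for the chia commands so that it runs anywhere.

```python
import subprocess
import sys


def responds(cmd, timeout_s):
    """Return True if cmd exits with status 0 within timeout_s seconds."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False  # hung, or the executable could not be started
    return result.returncode == 0


# Stand-ins for a healthy and a hung command.
healthy = responds([sys.executable, "-c", "pass"], timeout_s=30)
hung = responds([sys.executable, "-c", "import time; time.sleep(60)"], timeout_s=1)
```

In a real watchdog the command might be ["chia", "show", "-s"]; whether that command exits non-zero when the node is down is not confirmed here, so the timeout is the signal this sketch mainly relies on.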
After a manual systemctl restart chia-farmer.service, everything runs again until it crashes later on.
Version
2.4.2
What platform are you using?
Linux
What ui mode are you using?
CLI
Relevant log output