reqlez closed this issue 2 years ago
@reqlez : A pleasure to have your involvement. Thank you for your suggestions. I will review and get back to you. -Cheers!
@reqlez : Can you check that you have the latest changes (git pull)? The issue you describe should not exist, as the 'stop' function sends TERM to the parent process, preventing the node from being restarted.
Yes, I love your initiative; we really need to get more people on FreeBSD. I have a bit of a Twitter following, so I'm going to spread the message about FreeBSD for sure :) Mind you, I'm a complete ansible novice, so I installed all of this manually LOL. Maybe one day, when I decide to learn ansible, I will get more involved with your ansible deployment method :)
No, I understand that in your original code the stop process sends TERM to the parent, which in turn sends TERM to cardano-node. This is why I modified your code: cardano-node DOES NOT trap TERM (it's still a bug from a year ago). cardano-node can only take SIGINT to shut down safely, hence my "hack" is to send SIGINT to the cardano-node process, then send SIGTERM to the parent process.
I wonder if there is a way to somehow translate every TERM that goes out to cardano-node into SIGINT... but I'm not experienced enough with FreeBSD to know how to do it properly, which is why I did this hack.
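One way to approach the TERM-to-SIGINT translation wondered about here is a small wrapper that sits between the service supervisor and the node. What follows is a minimal sketch, not the repository's actual service code: `forward_term_as` is a hypothetical helper name, and the wrapped command is whatever the service would normally start. One caveat: a POSIX shell without job control starts background commands with SIGINT ignored, so this only helps for programs that install their own SIGINT handler. That cardano-node behaves this way under such a wrapper is an assumption that would need testing.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original service script):
# run a command as a child and, when this wrapper receives SIGTERM,
# deliver a different signal (e.g. INT) to the child instead.
# Usage: forward_term_as SIGNAL command [args...]
forward_term_as() {
    fwd_sig=$1
    shift

    "$@" &                      # start the real program in the background
    fwd_child=$!

    # On TERM, pass the chosen signal to the child rather than TERM itself.
    trap 'kill -s "$fwd_sig" "$fwd_child" 2>/dev/null' TERM

    # `wait` returns early (status > 128) when a trapped signal arrives,
    # so keep waiting while the child is still alive.
    while :; do
        wait "$fwd_child"
        fwd_status=$?
        kill -0 "$fwd_child" 2>/dev/null || break
    done

    trap - TERM
    # Status from the last wait; normally the child's own exit status.
    return "$fwd_status"
}
```

A supervisor could then start something like `forward_term_as INT cardano-node run ...` (arguments elided as in the thread) and keep sending TERM as before. FreeBSD's rc.subr also has a `sig_stop` variable that sets the signal used by the default stop method; whether that alone is enough here depends on which PID the service's pidfile tracks, which this thread does not show.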
By the way, you might as well update your ansible deployment for version 1.33.0. It's already out (but not vetted by the QC team, since they are away for the holidays), but I'm pretty sure 1.33.0 will be final; I tested it and it is pretty solid. There is ONE small issue with slower CLI calls, which is the only reason I can see for them to push a 1.33.1 (if they do).
Also, completely unrelated, but how did you get gLiveView.sh to work under FreeBSD? It complains about missing ss or lsof; then I installed lsof, and when I launch gLiveView.sh it's just a black screen for a long time and then I get this:
...so I installed all of this manually
Are you running a stakepool or just relay nodes? What is your ticker?
...This is why I modified your code, cardano-node DOES NOT trap TERM ( it's a bug from a year ago still ) cardano-node can only take SIGINT to shutdown safely:
I was unaware that cardano-node did not handle TERM and only handles SIGINT. Is this documented somewhere?
Yes I run PSB pool! If you join Twitter i'm happy to promo ur pool as well! We can be FreeBSD buddies ;-) My core is not on FreeBSD yet, but I have 1 relay now on FreeBSD. Just going to have to figure out some quirks. Also, the FreeBSD guys saw the message I posted and they want to help me write a port for FreeBSD as well! That could be interesting, if you want to get involved with that let me know.
And yes, 100% i tested SIGINT versus SIGTERM behavior. I saw the node re-validate the chain every time I used SIGTERM, however, with SIGINT it was fast start up. So you cannot use TERM or SIGTERM, only SIGINT.
Regarding bug... yes... https://github.com/input-output-hk/cardano-node/issues/1697
Yes I run PSB pool! If you join Twitter
Cool. My handle is @swinful. However, I do not have a good looking site to promote my pool -- yet! Life got in the way, but one goal for 2022 is to have a decent landing page for the FBSD pool.
I have 1 relay now on FreeBSD. Just going to have to figure out some quirks.
What are the quirks you are facing? Perhaps I need to make a friendly tutorial on how to launch the ansible playbook against a freebsd node to act as a relay. It takes about 45mins for the automated setup (not including time to perform a full sync).
Also, the FreeBSD guys saw the message I posted and they want to help me write a port for FreeBSD
What software/product are you looking to port? Would love to join and help out where I can.
And yes, 100% i tested SIGINT versus SIGTERM behavior. I saw the node re-validate the chain every time I used SIGTERM, however, with SIGINT it was fast start up. So you cannot use TERM or SIGTERM, only SIGINT.
Noted. I will commit your change later today and push to the repo. Thank you.
Regarding bug... yes... input-output-hk/cardano-node#1697
Thanks for sharing! I always wondered why my node took so long to start; I will test and see if this speeds things up. Currently my relay nodes take 30 minutes or more to start most times.
Cool! I will add you on Twitter!
About ansible tutorial: Yes, I'm going to be honest, I have 15+ years in IT infra, and I have been running servers since I was 16, and then I found your ansible scripts and I had no idea how to deploy them properly hahaha
Quirks like, for example, getting the node to not take forever to restart after an unsafe shutdown, getting gLiveView to actually work, and figuring out why the +RTS -N3 -RTS parameters are not working when I set them in the startup command...
The FreeBSD guys suggested making a port for cardano-node, so there would be automatic setup from ports, at least for a basic relay node. They mentioned they are willing to help with the process, since I have never written a port before. Then, of course, I will have to maintain it, maybe with some community help.
Yes, the bug is BAD. Mind you, my changes are "dirty"; I'm looking into how to translate SIGTERM to SIGINT in the service daemon instead (if that is even possible). I know that is what they do on Ubuntu to make this work properly, but I don't know how to do this on FreeBSD yet.
About ansible tutorial: ... no idea how to deploy them properly hahaha
No worries. Will work on something tailored to deploying a relay node.
The FreeBSD guys suggested making a port for cardano-node so automatic setup from ports at least for a basic relay node.
I too have never made a BSD port of anything useful. With the BSD folks willing to guide us, this would be a great learning experience. Shall I create a github cardano project for us to start? I was thinking we could model it on the port for bitcoin?
What do you think?
Well, the only difference between a relay and a producer is the config change. I think if we are going to do a port... maybe a single cardano-node port is fine, and people can change the config if they want to run a producer? Or... what are you referring to?
Another area of concern is time sync. I noticed that in the beginning the time sync was not that great. When running topology updater, it did not like the block sync.
I think it's worth replacing NTPD with https://calomel.org/chrony_network_time.html
Maybe we should start another thread with "improvements" instead of posting here in the same thread RE unsafe shutdown, it's getting cluttered.
Hello. Just heads up, I had to make modifications to your service code for it to work correctly.
It seems cardano-node does not trap SIGTERM and only traps SIGINT. So what I had to do is this for the stop process:
And this for the restart process:
The sleep 1 on the stop is a little dirty, since if I set it to 2 seconds, the node actually starts again for half a second then cuts off, but at 1 second it seems to work. Not sure if code can be modified so that the daemon does not try to re-start the process if it is ended by stop command? Then the sleep can be set to longer? Not sure of another solution.
However, it seems my modification only works well if I call the service restart manually. If I reboot the whole OS, it re-checks the ledger anyway. But at least cnode can be stopped and started manually when needed without re-validation.
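As an alternative to a fixed `sleep 1`, a stop routine can send the signal and then poll until the process has really exited. This is only a sketch under assumptions, not the modification described in this post: `signal_and_wait` is a hypothetical helper, and the 60 second default timeout is a guess rather than a measured shutdown time.

```shell
#!/bin/sh
# Hypothetical helper: send a signal to a PID, then poll until the process
# has exited or a timeout (in seconds) runs out.
# Usage: signal_and_wait SIGNAL PID [TIMEOUT]
# Returns 0 once the process is gone, 1 if it is still running at timeout.
signal_and_wait() {
    saw_sig=$1
    saw_pid=$2
    saw_left=${3:-60}           # assumed default; tune to the real shutdown time

    # kill fails if the process does not exist (or we lack permission);
    # this sketch treats that as "nothing to stop".
    kill -s "$saw_sig" "$saw_pid" 2>/dev/null || return 0

    while [ "$saw_left" -gt 0 ]; do
        kill -0 "$saw_pid" 2>/dev/null || return 0   # process has exited
        sleep 1
        saw_left=$((saw_left - 1))
    done

    # One last check after the final sleep.
    kill -0 "$saw_pid" 2>/dev/null || return 0
    return 1
}
```

In a stop routine this could be called as `signal_and_wait INT "$node_pid" 120`, where `node_pid` is a placeholder for however the script locates the cardano-node PID. It removes the dependence on timing, but it does not by itself stop a supervising parent from restarting the node once it exits. On the reboot case: FreeBSD only runs an rc.d script's stop step at shutdown when the script carries a `# KEYWORD: shutdown` line, so that may be worth checking. Whether it explains the re-validation seen here is untested.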