chadgeary / nifi

Deploy a secured, clustered, auto-scaling NiFi service in AWS.

Ubuntu and Rhel7 need maintenance #3

Closed pratibhadeepti closed 3 years ago

pratibhadeepti commented 3 years ago

Hi @chadgeary, the State Manager association status shows as failed, and when I check the SSM logs I see this:


Mar 19 19:20:51 tf-nifi-zookeeper-2 sudo[11481]: pam_unix(sudo:session): session opened for user nifi by (uid=0)
Mar 19 19:20:51 tf-nifi-zookeeper-2 python3[11483]: ansible-aws_s3 Invoked with mode=get bucket=nifi-bucket-123 object=/nifi/downloads/zookeeper.tar.gz dest=/opt/
Mar 19 19:20:51 tf-nifi-zookeeper-2 python3[11483]: ansible-aws_s3 [WARNING] Module remote_tmp /home/nifi/.ansible/tmp did not exist and was created with a mode o
Mar 19 19:20:51 tf-nifi-zookeeper-2 sudo[11481]: pam_unix(sudo:session): session closed for user nifi 
chadgeary commented 3 years ago

Hey - I am in the process of moving this code around and re-organizing. As far as I know, RHEL should work at the moment but Ubuntu is not done. Apologies.


pratibhadeepti commented 3 years ago

Hey @chadgeary, I tried RHEL as well, but that isn't working either. Can you please check it? This is really important for us.

chadgeary commented 3 years ago

Hey @pratibhadeepti, RHEL 7 has been fixed in commit 1b2d607.

I'll see about setting some time up to do ubuntu as well.

Cheers

chadgeary commented 3 years ago

Whoops, did not mean to close this. Will close when Ubuntu is good to go.

chadgeary commented 3 years ago

Fixed in 0f80f07 and 6815c28, and tested successfully on Ubuntu and RHEL. I revamped quite a bit of the codebase. Enjoy!

pratibhadeepti commented 3 years ago

Thank you so much @chadgeary for taking the time to revamp the codebase. This time I was at least able to create the resources, and I made changes to handle the case where the VPC and subnets already exist.

I'm facing one issue: when I hit the ELB load balancer DNS name, I sometimes get "Insufficient permissions" errors, and the NiFi UI only opens after several attempts. Why does this happen, and what fix can I try? Also, how is NiFi failover handled? Any help would be much appreciated.

Thanks

[Attached images: "Screenshot 2021-04-21 155238", "nifi01"]

chadgeary commented 3 years ago

I'm not sure what version of the codebase you're using, but first be sure to grab the latest commit - I worked out a lot of bugs!

Second, if you're seeing an "Untrusted Proxy" error, it's because the node did not join the cluster properly. That could be caused by a timeout problem or by the older codebase.
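One way to check whether a node failed to join is to look at the cluster summary that NiFi exposes over its REST API. The snippet below is only a rough sketch: it assumes the JSON returned by `GET /nifi-api/controller/cluster` has a `cluster.nodes` list whose entries carry `address`, `apiPort`, and `status` fields, and the sample data and the `nodes_not_connected` helper are made up for illustration. Fetching the JSON (TLS, client certs, tokens) is left out.

```python
from typing import Any


def nodes_not_connected(cluster_entity: dict[str, Any]) -> list[str]:
    """List "address:port (STATUS)" for every node whose status is not CONNECTED."""
    nodes = cluster_entity.get("cluster", {}).get("nodes", [])
    return [
        f"{node.get('address')}:{node.get('apiPort')} ({node.get('status')})"
        for node in nodes
        if node.get("status") != "CONNECTED"
    ]


# Hypothetical sample, shaped like the assumed cluster response.
sample = {
    "cluster": {
        "nodes": [
            {"address": "node-a.internal", "apiPort": 8443, "status": "CONNECTED"},
            {"address": "node-b.internal", "apiPort": 8443, "status": "CONNECTING"},
            {"address": "node-c.internal", "apiPort": 8443, "status": "DISCONNECTED"},
        ]
    }
}

for line in nodes_not_connected(sample):
    print(line)
```

Any node listed here has not (or not yet) joined, which would line up with the intermittent errors behind the load balancer.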

As for your other image with the arrows: are you telling me you cannot click on those? That is a default NiFi setting. You (the admin) need to grant yourself permission to modify processes, etc. Start by right-clicking the blank space and editing the policies. There's a hard-to-spot line called "Create new" policy or something similar. It does not stand out.

I will have a video up, maybe this weekend, where I go through deployment, scaling up/down, and setting up nifi for the first time.

Regards

chadgeary commented 3 years ago

If you want to try out the latest version, I've considerably re-written RHEL's implementation to use NLBs. This also provides a way to expose service ports to clients and auto-heal zookeeper nodes. The issue with certificates and untrusted proxies is also taken care of.

Additionally, a lambda function fetches the NiFi software. This speeds up bootstrapping considerably!

chadgeary commented 3 years ago

I should mention the root cause of the untrusted proxy error: there is a permission policy in NiFi for trusted proxies, and initially we only know about (and therefore only define) the zookeeper nodes in authorizers.xml.

To fix this, there are two options:

  1. Add the new nodes via the WebUI (or possibly API) as trusted proxy entities. This isn't easy to do if you're hitting the 'Untrusted Proxy' error.
  2. Separate the WebUI load balancer from the services load balancer. This is what I did in the new code base.

Cheers

chadgeary commented 3 years ago

And I was wrong yet again - turns out there's a policy 'Proxy User Requests' that allows a node to fetch/present data to a user. There is now an API call made when nodes scale up that grants them that policy.

The zookeepers are given that policy at cluster initialization.
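For anyone wanting to do the same by hand, here is a rough sketch of the request body such a grant might use. It is not the code from this repo. It assumes the "Proxy User Requests" policy can be read with `GET /nifi-api/policies/write/proxy` and updated with `PUT /nifi-api/policies/{id}`, and that the policy entity carries `revision` and `component.users`. The IDs and the `add_user_to_policy` helper are hypothetical, and the HTTP calls themselves are omitted.

```python
import copy
from typing import Any


def add_user_to_policy(policy_entity: dict[str, Any], user_id: str) -> dict[str, Any]:
    """Return a copy of the policy entity with user_id added to its users.

    The input is left untouched, and a user that is already listed is not
    added a second time, so repeating the call on scale-up is harmless.
    """
    updated = copy.deepcopy(policy_entity)
    users = updated["component"].setdefault("users", [])
    if not any(user.get("id") == user_id for user in users):
        users.append({"id": user_id})
    return updated


# Hypothetical policy entity, shaped like the assumed GET response.
current_policy = {
    "revision": {"version": 3},
    "component": {
        "id": "policy-1234",          # made-up policy ID
        "resource": "/proxy",
        "action": "write",
        "users": [{"id": "zk-node-1"}],  # made-up tenant ID of an existing node
    },
}

# "new-node-7" stands in for the tenant ID of a freshly scaled-up node.
put_body = add_user_to_policy(current_policy, "new-node-7")
print([user["id"] for user in put_body["component"]["users"]])
```

The revision is passed back unchanged because NiFi uses it for optimistic locking on updates; the new node also has to exist as a user (tenant) before it can be added to the policy.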