Closed Cyb3rWard0g closed 5 years ago
Hey @Cyb3rWard0g, check out here: https://github.com/Cyb3rWard0g/mordor/blob/master/environment/shire/aws/README.md in the Troubleshooting Tips section
This error happens when one of the machines fails. In order to see the machine that failed, run terraform output. Whichever machine doesn't have an IP address is the one that failed. You can correlate that by going to when the machine was trying to build in the output, and see where it failed.
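As a rough sketch of that check, assuming terraform output prints one name = value pair per line, a small shell filter can list the outputs that lack an IP address. The function name outputs_without_ip is hypothetical and not part of the Mordor configuration, and an output that Terraform omits entirely will not show up here.

```shell
# Print the name of every output whose value is not an IPv4 address.
# Input: `terraform output` text on stdin, one `name = value` pair per line.
outputs_without_ip() {
  awk -F' *= *' '
    /=/ {
      value = $2
      gsub(/"/, "", value)      # some Terraform versions quote string values
      sub(/ +$/, "", value)     # drop trailing spaces
      if (value !~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/) print $1
    }'
}

# Usage (run from the terraform/ folder):
#   terraform output | outputs_without_ip
```

Any name it prints is a candidate for the machine whose provisioning script failed.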
To fix this run:
terraform destroy
Remove terraform.tfstate, terraform.tfstate.backup and .terraform from the /terraform/ folder
Re run the initialization process: terraform init
terraform apply
Let me know if the problem continues. Thanks
Can you provide more details on this? "You can correlate that by going to when the machine was trying to build in the output, and see where it failed." I saw that guac was not up. How do I correlate that info by "going to when the machine was trying to build in the output, and see where it failed"? Also, I don't think "Re run the initialization process: terraform init terraform apply" should be the fix, because it keeps failing. I believe we should focus on why it is failing and not just try to rebuild it, right? Just making sure there are more steps to actually get to the specific issue 😄
@Cyb3rWard0g
The troubleshooting steps are not:
Re run the initialization process: terraform init terraform apply
They are:
To fix this run:
terraform destroy
Remove terraform.tfstate, terraform.tfstate.backup and .terraform from the /terraform/ folder
Re run the initialization process: terraform init
terraform apply
Which can be found here, at the bottom of the troubleshooting tips: https://github.com/Cyb3rWard0g/mordor/tree/master/environment/shire/aws
Please try this, as I cannot reproduce this issue on my end.
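The sequence above could be wrapped in a small shell function so the steps always run in the same order. This is only a sketch: reset_lab and the TERRAFORM override are hypothetical names, and it assumes it is run from the terraform/ folder. terraform destroy and terraform apply will still prompt for confirmation.

```shell
# Tear down, clear local state, and rebuild. Run from the terraform/ folder.
# TERRAFORM is a hypothetical override so the sequence can be dry-run
# (for example TERRAFORM=echo); it defaults to the real binary.
reset_lab() {
  tf="${TERRAFORM:-terraform}"
  "$tf" destroy || return 1     # stop here if the teardown fails
  rm -rf terraform.tfstate terraform.tfstate.backup .terraform
  "$tf" init || return 1
  "$tf" apply
}

# Usage:
#   reset_lab                   # real run
#   TERRAFORM=echo reset_lab    # dry run: prints the subcommands, still removes the state files
```

Stopping when destroy fails avoids deleting the state file while instances it tracks are still running.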
To answer "Like where to look and what logs to check":
Terraform doesn't have great debugging capabilities. Please check here for further context:
https://github.com/hashicorp/terraform/issues/16752
That is why I mentioned running terraform output. It lets you see which boxes built and which didn't. When you don't see an IP for one, that means the box didn't successfully build. You can then go back to the terminal where you ran terraform apply and scroll up to when that box was being built. Odds are high that the Ubuntu 16.04 didn't update properly, not allowing the script to be ran correctly. This is a one-off and doesn't happen very frequently. The only way to fix this is to create an Apache Guac AMI. However, doing that would make the build no longer dynamic, which would make changes harder.
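Scrolling back is easier if the apply output was saved to a file. A minimal sketch, assuming the run was captured with something like terraform apply 2>&1 | tee apply.log, and that provisioner lines are prefixed with the resource address followed by (remote-exec): as in the HELK lines in this thread. The file name apply.log, the function last_lines_for, and the address aws_instance.guac are hypothetical.

```shell
# Print the last N provisioner lines a given resource wrote to a saved apply log.
# $1 = resource address, $2 = log file, $3 = number of lines (default 20)
last_lines_for() {
  resource="$1"; log="$2"; count="${3:-20}"
  grep -F "$resource (remote-exec):" "$log" | tail -n "$count"
}

# Usage:
#   last_lines_for aws_instance.guac apply.log 40
```

The last few lines printed for the failed box are usually the closest thing to a cause that the Terraform side can show.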
Again, I request you try the troubleshooting steps previously stated, as I think they will help with this issue.
Please let me know if the problem occurs again.
Thank you.
That's what I meant lol:
To fix this run:
terraform destroy
Remove terraform.tfstate, terraform.tfstate.backup and .terraform from the /terraform/ folder
Re run the initialization process: terraform init
terraform apply
That does not fix the issue, right? It just runs it again? Just making sure there are more troubleshooting steps besides destroy, remove tfstate*, init and apply.
Ahh so there is not a way to look at the specific step where it failed from a terraform side. Got it!
So then we would have to come up with specific steps on how to troubleshoot the box that potentially failed. For example, if HELK fails, then I can say please take a look at /var/log/helk_install.log for any specific details. That's how I found out that docker-compose was not being installed properly, remember? Anyways, thank you for the extra details. Definitely an opportunity to learn and document how to troubleshoot each box 😉.
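For a box that does have an install log, such as /var/log/helk_install.log on HELK, a quick first pass is to filter the log for lines that look like failures. This is a sketch: the keyword list is only a guess at what a failed step prints, suspicious_lines is a hypothetical helper, and other boxes may not write a comparable log.

```shell
# Filter a log on stdin down to lines that look like failures, with line numbers.
suspicious_lines() {
  grep -n -i -E 'error|fail|unable to|not available|no installation candidate'
}

# Usage (on the HELK box):
#   sudo cat /var/log/helk_install.log | suspicious_lines
```

The line numbers make it easy to jump to the surrounding context in the full log.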
performing the destroy, remove tfstate*, init and apply steps at the moment. I will check which one fails and try to figure out why. Also, that will help me to provide more details on the Empire issue :)
failed again on
aws_instance.helk (remote-exec): ***********************************************************************************
aws_instance.helk (remote-exec): ** [HELK-INSTALLATION-INFO] HELK WAS INSTALLED SUCCESSFULLY **
aws_instance.helk (remote-exec): ** [HELK-INSTALLATION-INFO] USE THE FOLLOWING SETTINGS TO INTERACT WITH THE HELK **
aws_instance.helk (remote-exec): ***********************************************************************************
aws_instance.helk (remote-exec): HELK KIBANA URL: https://172.18.39.6
aws_instance.helk (remote-exec): HELK KIBANA USER: helk
aws_instance.helk (remote-exec): HELK KIBANA PASSWORD: hunting
aws_instance.helk (remote-exec): HELK SPARK MASTER UI: http://172.18.39.6:8080
aws_instance.helk (remote-exec): HELK JUPYTER SERVER URL: http://172.18.39.6/jupyter
aws_instance.helk (remote-exec): HELK JUPYTER CURRENT TOKEN: 1fb5d7a3846b4784aea5ae44b23a664870741ab894219c97
aws_instance.helk (remote-exec): HELK ZOOKEEPER: 172.18.39.6:2181
aws_instance.helk (remote-exec): HELK KSQL SERVER: 172.18.39.6:8088
aws_instance.helk (remote-exec): IT IS HUNTING SEASON!!!!!
aws_instance.helk: Creation complete after 9m29s [id=i-09a09ea96bfd2eaf2]
Error: error executing "/tmp/terraform_940696272.sh": Process exited with status 5
Joses-MacBook-Air:terraform cyb3rpandah$
terraform output does not show GUAC with an IP. So should I go straight to the GUAC box? You mentioned: "Odds are high that the Ubuntu 16.04 didn't update properly, not allowing the script to be ran correctly." Would that still be GUAC?
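The Error line itself does not name the machine, but the apply output does record which resources started and which finished. A sketch, assuming the output was saved without color codes (for example terraform apply -no-color 2>&1 | tee apply.log) and relying on the "Creating..." and "Creation complete after" lines that Terraform prints per resource. The names apply.log and unfinished_resources are hypothetical.

```shell
# List resources that began creation but never reported completion.
# Input: saved `terraform apply` output on stdin.
unfinished_resources() {
  awk '
    /: Creating\.\.\.$/          { sub(/: Creating\.\.\.$/, "");          started[$0] = 1 }
    /: Creation complete after / { sub(/: Creation complete after .*/, ""); finished[$0] = 1 }
    END { for (r in started) if (!(r in finished)) print r }
  ' | sort
}

# Usage:
#   unfinished_resources < apply.log
```

In the run above, aws_instance.helk would be filtered out because it reports "Creation complete", leaving only the resource whose script exited with status 5.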
@Cyb3rWard0g is this from a fresh git pull? I rebuilt the lab earlier, but I'm doing it again right now.
Please do this:
terraform destroy
Run:
sudo rm terraform.tfstate.backup
sudo rm terraform.tfstate
Then
sudo terraform init
sudo terraform apply
Also yes, if the Guacamole box doesn't have an IP, it means the Apache Guacamole script didn't run correctly: https://github.com/jsecurity101/ApacheGuacamole
@jsecurity101 thats exactly what I did
performing the destroy, remove tfstate*, init and apply steps at the moment
I am not sure if you saw one of my last messages.
Ok so how do I troubleshoot Guac? Any specific log locations to understand what failed specifically? or you recommend to run the script again? the guac bash script
@Cyb3rWard0g Sorry, my browser didn't refresh to see that.
Just rebuilt the lab and it succeeded. No errors. This could be an internet connectivity issue as well.
Any specific log locations to understand what failed specifically? or you recommend to run the script again? the guac bash script
Yes. Go to the Guac box:
git clone https://github.com/jsecurity101/ApacheGuacamole.git
cd ApacheGuacamole
sudo bash ApacheGuacamole.sh
You will then have to change the user-mapping.xml (Path: /etc/guacamole/user-mapping.xml) to Mordor's: https://github.com/Cyb3rWard0g/mordor/blob/master/environment/shire/aws/scripts/ApacheGuacamole/user-mapping.xml
Then run:
sudo service tomcat7 restart
ahhh got it thank you very much man. I will do that after dinner :) I appreciate it
I SSH'd to the guac box and followed these steps:
It failed here:
guac@ip-172-18-39-9:~$ sudo apt-get install libcairo2-dev libjpeg62-dev libpng12-dev libossp-uuid-dev libfreerdp-dev libpango1.0-dev libssh2-1-dev libssh-dev tomcat7 tomcat7-admin tomcat7-user -y
sudo: unable to resolve host ip-172-18-39-9
Reading package lists... Done
Building dependency tree
Reading state information... Done
Package libjpeg62-dev is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source
However the following packages replace it:
libjpeg-turbo8-dev
Package libcairo2-dev is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source
E: Package 'libcairo2-dev' has no installation candidate
E: Package 'libjpeg62-dev' has no installation candidate
E: Unable to locate package libossp-uuid-dev
E: Unable to locate package libpango1.0-dev
E: Couldn't find any package by glob 'libpango1.0-dev'
E: Couldn't find any package by regex 'libpango1.0-dev'
I ran sudo apt-get update and then ran the same line https://github.com/Cyb3rWard0g/mordor/blob/2cd595efb69c2f9a35935724fb70079f80c7bc2c/environment/shire/aws/terraform/main.tf#L218 and it installed all the packages.
Then I followed this step which ran successfully:
One thing that I noticed right away is that you repeat the installation of libraries:
apt-get install libcairo2-dev libjpeg62-dev libpng12-dev libossp-uuid-dev libfreerdp-dev libpango1.0-dev libssh2-1-dev libssh-dev tomcat7 tomcat7-admin tomcat7-user -y
updated user-mapping and restarted tomcat7 service
It all seems to be working now. The only thing that failed was https://github.com/Cyb3rWard0g/mordor/blob/2cd595efb69c2f9a35935724fb70079f80c7bc2c/environment/shire/aws/terraform/main.tf#L218 . I wonder if an apt-get update might help there.
"I wonder if an apt-get update might help there."
This is already being done:
"One thing that I noticed right away is that you repeat the installation of libraries:" This is a check to make sure that the libraries are correctly installed
Mmmm, interesting. Then it fails all the time. Welp, I guess we will never know. I will try it when I get back to the States. Thank you for your time and help. There should definitely be some steps to troubleshoot each box, just in case. That's a future docs update.
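If the package install fails only intermittently, for instance because of a connectivity problem while the package lists are being refreshed, one option is to retry the update and the install a few times before giving up. This is a sketch of that idea, not what the Mordor scripts currently do. The helper retry is hypothetical, and the attempt count and delay in the usage comment are arbitrary.

```shell
# Run a command up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
  attempts="$1"; delay="$2"; shift 2
  n=1
  while ! "$@"; do
    [ "$n" -ge "$attempts" ] && return 1
    n=$((n + 1))
    sleep "$delay"
  done
}

# Usage (on the Guac box):
#   retry 3 10 sudo apt-get update &&
#     retry 3 10 sudo apt-get install -y libcairo2-dev libjpeg62-dev libpng12-dev
```

Chaining with && means the install is attempted only after an update has actually succeeded.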
I got the following error when building the environment in the cloud via terraform:
I was able to RDP to the Windows boxes and use them properly, but something must not have completed successfully. Would you mind providing a few steps to troubleshoot the build when it does not install properly? Like where to look and what logs to check. Thank you!