Open peterclemenko opened 5 years ago
Current docker stack services output. It's been like this since I first tried 30 minutes ago.
PS D:\dev\WALKOFF> docker stack services walkoff
ID NAME MODE REPLICAS IMAGE PORTS
2tlevhxrh2gy walkoff_app_nmap replicated 0/0 127.0.0.1:5000/walkoff_app_nmap:1.0.0
42kbvqsr6k12 walkoff_app_ssh replicated 0/0 127.0.0.1:5000/walkoff_app_ssh:1.0.0
8hwek6bhek9m walkoff_resource_registry replicated 1/1 registry:2 *:5000->5000/tcp
94voa2ik6obx walkoff_core_worker replicated 0/0 127.0.0.1:5000/worker:latest
aeijazmi07bc walkoff_resource_nginx replicated 0/1 bitnami/nginx:latest *:8080->8080/tcp
b29hekf4ye1p walkoff_resource_portainer replicated 1/1 portainer/portainer:latest
dxs2sed9x4g8 walkoff_app_walk_off replicated 0/0 127.0.0.1:5000/walkoff_app_walk_off:1.0.0
fg9e4fwh588b walkoff_resource_redis replicated 1/1 redis:latest *:6379->6379/tcp
ik51fiymgcmo walkoff_app_hive replicated 0/0 127.0.0.1:5000/walkoff_app_hive:1.0.0
k0qnoehvirhl walkoff_app_ip_addr_utils replicated 0/0 127.0.0.1:5000/walkoff_app_ip_addr_utils:1.0.0
morcj465ozoi walkoff_app_sdk replicated 0/0 127.0.0.1:5000/walkoff_app_sdk:latest
mzx14hl1o69k walkoff_resource_minio replicated 1/1 minio/minio:latest
olaf1aiuub54 walkoff_core_umpire replicated 0/1 127.0.0.1:5000/umpire:latest
pdbkm6dg1fg8 walkoff_app_adversary_hunting replicated 0/0 127.0.0.1:5000/walkoff_app_adversary_hunting:1.0.0
rbxro4kdtarp walkoff_app_mitre_attack replicated 0/0 127.0.0.1:5000/walkoff_app_mitre_attack:1.0.0
snu26p7pj7uv walkoff_app_basics replicated 0/0 127.0.0.1:5000/walkoff_app_basics:1.0.0
t3kr4y37yyvl walkoff_app_power_shell replicated 0/0 127.0.0.1:5000/walkoff_app_power_shell:1.0.0
unxd9lyay47v walkoff_core_api_gateway replicated 0/1 127.0.0.1:5000/api_gateway:latest
y4d2ch0fi1mf walkoff_resource_postgres replicated 1/1 postgres:latest
Walkoff appears to have been broken on Windows since beta 1 was released. Can this please be fixed so it can be stable tagged as working on Windows? I'm trying to build an app that uses Walkoff as the automation layer and even if I can manage to get it working temporarily, it's unshippable if there is no stable dependency that runs.
Does this still tie in with the last issue? We changed the Windows setup to mirror the Linux setup, so try installing with walkoff.ps1. The installation instructions need to be updated and the old windows_setup.ps1 needs to be deleted; apologies for the oversight. With walkoff.ps1 you should be able to use the bootloader with the same commands as the Linux version. For example, walkoff.ps1 up --build and walkoff.ps1 down should work. Let us know whether this solves the problem or it's something else.
I'm not sure if this ties in to the last issue or not. walkoff.ps1 isn't working. This report is based on walkoff.ps1
Looking at the logs from https://github.com/nsacyber/WALKOFF/issues/239, I wonder if it's a symptom of the same issue.
The logs from #239 are due to running walkoff_setup.ps1 instead of walkoff.ps1 (there was an errant :5000 left over in the command retrieving the host's DockerNAT IP) - I'll push that deletion to master soon.
For this one, could you post the results of docker service ps service_name --no-trunc for the services that are at 0/1?
walkoff_resource_nginx
walkoff_core_umpire
walkoff_core_api_gateway
Additionally, are you using Docker for Windows (Hyper-V backend) or the older Docker Toolbox for Windows (VirtualBox backend)? @iadgovuser11 and I are using the former for testing on Windows. If you are on the latter, I can try to get it set up and replicate the problem.
I'm on the Hyper-V backend.
walkoff_resource_nginx
docker service ps walkoff_resource_nginx --no-trunc
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
ezxfwezamx9gzodwktx5c1rtz walkoff_resource_nginx.1 bitnami/nginx:latest@sha256:a0be1c01c8bd56d9609b612eb8052c1b1d1f6ec44ef904f666567f5644b6eb63 docker-desktop Ready Ready less than a second ago
put0jh8dsrn08m969nhac182y \_ walkoff_resource_nginx.1 bitnami/nginx:latest@sha256:a0be1c01c8bd56d9609b612eb8052c1b1d1f6ec44ef904f666567f5644b6eb63 docker-desktop Shutdown Failed less than a second ago "task: non-zero exit (1)"
bo52v029cykqw7m7ivi9v3cgx \_ walkoff_resource_nginx.1 bitnami/nginx:latest@sha256:a0be1c01c8bd56d9609b612eb8052c1b1d1f6ec44ef904f666567f5644b6eb63 docker-desktop Shutdown Failed 10 seconds ago "task: non-zero exit (1)"
xmkyx09lypu9zeqweh8pzfjrg \_ walkoff_resource_nginx.1 bitnami/nginx:latest@sha256:a0be1c01c8bd56d9609b612eb8052c1b1d1f6ec44ef904f666567f5644b6eb63 docker-desktop Shutdown Failed 19 seconds ago "task: non-zero exit (1)"
v5a05wzb4k2t7kme97u82lvbz \_ walkoff_resource_nginx.1 bitnami/nginx:latest@sha256:a0be1c01c8bd56d9609b612eb8052c1b1d1f6ec44ef904f666567f5644b6eb63 docker-desktop Shutdown Failed 28 seconds ago "task: non-zero exit (1)"
docker service ps walkoff_core_umpire --no-trunc
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
61wf1nx59lxu54r49smrztlou walkoff_core_umpire.1 127.0.0.1:5000/umpire:latest docker-desktop Ready Rejected less than a second ago "No such image: 127.0.0.1:5000/umpire:latest"
pec5rstjc0m2qiqsfmw5i0w20 \_ walkoff_core_umpire.1 127.0.0.1:5000/umpire:latest docker-desktop Shutdown Rejected 5 seconds ago "No such image: 127.0.0.1:5000/umpire:latest"
3xyqh9ii3nhqhfii4mwkxc3vc \_ walkoff_core_umpire.1 127.0.0.1:5000/umpire:latest docker-desktop Shutdown Rejected 10 seconds ago "No such image: 127.0.0.1:5000/umpire:latest"
wnokul17b0egng3kmo3ez82j1 \_ walkoff_core_umpire.1 127.0.0.1:5000/umpire:latest docker-desktop Shutdown Rejected 15 seconds ago "No such image: 127.0.0.1:5000/umpire:latest"
kmuy9yeq5ky6wshg15u0fxr0n \_ walkoff_core_umpire.1 127.0.0.1:5000/umpire:latest docker-desktop Shutdown Rejected 20 seconds ago "No such image: 127.0.0.1:5000/umpire:latest"
docker service ps walkoff_core_api_gateway --no-trunc
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
x3v1jpdui1v4vunn8b1gf9b3h walkoff_core_api_gateway.1 127.0.0.1:5000/api_gateway:latest docker-desktop Ready Rejected 2 seconds ago "No such image: 127.0.0.1:5000/api_gateway:latest"
6bd0icaf3bo2chu45r6r4lews \_ walkoff_core_api_gateway.1 127.0.0.1:5000/api_gateway:latest docker-desktop Shutdown Rejected 7 seconds ago "No such image: 127.0.0.1:5000/api_gateway:latest"
fa92ylbubtrimcm36ktbslh1g \_ walkoff_core_api_gateway.1 127.0.0.1:5000/api_gateway:latest docker-desktop Shutdown Rejected 12 seconds ago "No such image: 127.0.0.1:5000/api_gateway:latest"
o4x61wrox95ru5uutj1dgz5cx \_ walkoff_core_api_gateway.1 127.0.0.1:5000/api_gateway:latest docker-desktop Shutdown Rejected 17 seconds ago "No such image: 127.0.0.1:5000/api_gateway:latest"
ox1im3tm26cjs2nhfibj5tejd \_ walkoff_core_api_gateway.1 127.0.0.1:5000/api_gateway:latest docker-desktop Shutdown Rejected 22 seconds ago "No such image: 127.0.0.1:5000/api_gateway:latest"
Not sure what's breaking these
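A rough sketch for anyone reading these long --no-trunc listings: the quoted ERROR column is the useful part, so it can help to count the distinct messages. The Python below is only an illustration. summarize_task_errors is a hypothetical helper name, not part of WALKOFF or Docker, and the sample rows are abbreviated from the output above (image digest and tree markers dropped).

```python
import re
from collections import Counter


def summarize_task_errors(ps_output):
    """Count each distinct quoted ERROR message in `docker service ps` output."""
    # Docker prints task errors in double quotes, e.g. "No such image: ...".
    return Counter(re.findall(r'"([^"]+)"', ps_output))


# Rows abbreviated from the thread; only the quoted error text matters here.
sample = '''
61wf1nx59lxu54r49smrztlou walkoff_core_umpire.1 127.0.0.1:5000/umpire:latest docker-desktop Ready Rejected less than a second ago "No such image: 127.0.0.1:5000/umpire:latest"
pec5rstjc0m2qiqsfmw5i0w20 walkoff_core_umpire.1 127.0.0.1:5000/umpire:latest docker-desktop Shutdown Rejected 5 seconds ago "No such image: 127.0.0.1:5000/umpire:latest"
put0jh8dsrn08m969nhac182y walkoff_resource_nginx.1 bitnami/nginx:latest docker-desktop Shutdown Failed less than a second ago "task: non-zero exit (1)"
'''

for message, count in summarize_task_errors(sample).most_common():
    print(count, message)
```

Read this way, the two core services are rejected because their images are missing from the local registry, while nginx starts and then exits, which is a different failure.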
Try walkoff.ps1 up --build; it should re-download and build the images that are missing.
Already tried that with no luck.
Interestingly enough, walkoff.ps1 down seems to be rebuilding everything and not working either. It seems like down isn't actually executing, and runs the same as up.
Could you provide the output for what you're seeing? I'm unable to replicate that behavior on Windows.
Yeah, correction on the down one, that was my misreading it. Sorry, I'm out of it today.
Build didn't fix the problem though.
While the walkoff_resource_registry service is running, can you run curl http://localhost:5000/v2/_catalog | Select-Object -ExpandProperty Content in PowerShell and send the output?
Additionally, are the images that docker service ps gave you present when you run docker images?
I'm in the process of doing it. Also, I notice the following at the end of the output every time I start walkoff.ps1 with up.
koff_app_mitre_attack (id: mrf5mtjiiaet39pud0e21ql57)
2019-09-06 06:44:12,578 - BOOTLOADER - INFO:Updating service walkoff_core_worker (id: bj9ojho6jqgt1qojpbfgfaur0)
2019-09-06 06:44:12,578 - BOOTLOADER - INFO:Walkoff stack deployed, it may take a little time to converge. Use 'docker stack services walkoff' to check on Walkoff services. Web interface should be available at 'https://127.0.0.1:8080' once walkoff_resource_nginx is up.
2019-09-06 06:44:12,579 - UMPIRE - INFO:Docker connection closed.
Could the Docker connection closing cause part of this?
The "Docker connection closed" message is standard bootloader output. It just means the bootloader's connection to Docker was closed.
PS D:\dev\WALKOFF> docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
5c9440f958bd portainer/portainer:latest "/portainer" 30 seconds ago Up 24 seconds 9000/tcp walkoff_resource_portainer.1.a5c7lkbwiva1ns80y5x87mvz1
50fdd5734b62 minio/minio:latest "/usr/bin/docker-ent…" 8 hours ago Up 8 hours 9000/tcp walkoff_resource_minio.1.fcx6bn3sbkf9pzvw4gbcurhrs
709b0e95aecb postgres:latest "docker-entrypoint.s…" 8 hours ago Up 8 hours 5432/tcp walkoff_resource_postgres.1.7z2sb45n6qnqzsw324ih56nqf
1de883ae9e36 registry:2 "/entrypoint.sh /etc…" 8 hours ago Up 8 hours 5000/tcp walkoff_resource_registry.1.ui6p0ixrjvkpy1zmwgep5cw2o
54a534a9a865 redis:latest "docker-entrypoint.s…" 8 hours ago Up 8 hours 6379/tcp walkoff_resource_redis.1.y90ai8uv04higx53h59laq6pp
f8170beee17c prismagraphql/prisma:1.34 "/bin/sh -c /app/sta…" 8 days ago Up 8 hours 0.0.0.0:4466->4466/tcp nestjs-prisma-starter_prisma_1
20b702f49f19 mongo:3.6 "docker-entrypoint.s…" 8 days ago Up 8 hours 0.0.0.0:27017->27017/tcp nestjs-prisma-starter_mongo_1
PS D:\dev\WALKOFF> curl http://localhost:5000/v2/_catalog | Select-Object -ExpandProperty Content
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 37 100 37 0 0 37 0 0:00:01 --:--:-- 0:00:01 787
Select-Object : Property "Content" cannot be found.
At line:1 char:42
+ ... ://localhost:5000/v2/_catalog | Select-Object -ExpandProperty Content
+ ~~~~~~~~~
+ CategoryInfo : InvalidArgument: ({"repositories":["walkoff_app_sdk"]}:PSObject) [Select-Object], PSArgumentException
PS D:\dev\WALKOFF> docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
bitnami/nginx
So it looks like it's failing somewhere in the build process: only walkoff_app_sdk is showing up. What output do you see when you run walkoff.ps1 up --build --debug?
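As a hedged sketch of the check being described here: the registry's /v2/_catalog endpoint returns a JSON object with a repositories list, so the images the stack expects can be compared against what was actually pushed. The function names and the EXPECTED set are assumptions for illustration (EXPECTED is read off the image column of the docker stack services output earlier in the thread). Plain Python is used so the check does not depend on which curl PowerShell resolves to; the progress meter in the output above suggests the real curl binary ran rather than the Invoke-WebRequest alias, which would explain the missing Content property.

```python
import json
import urllib.request


def parse_catalog(body):
    """Return the repository names from a /v2/_catalog response body."""
    return set(json.loads(body).get("repositories", []))


def fetch_catalog(registry="http://localhost:5000"):
    """Query the registry catalog; needs walkoff_resource_registry running."""
    with urllib.request.urlopen(registry + "/v2/_catalog", timeout=10) as resp:
        return parse_catalog(resp.read().decode("utf-8"))


def missing_repositories(expected, present):
    """List expected repositories that the registry does not have."""
    return sorted(set(expected) - set(present))


# Assumption: core image names as they appear in the stack listing above.
EXPECTED = {"api_gateway", "umpire", "worker", "walkoff_app_sdk"}

# Catalog body exactly as it appears in the PowerShell error message above.
body = '{"repositories":["walkoff_app_sdk"]}'
print(missing_repositories(EXPECTED, parse_catalog(body)))
# prints ['api_gateway', 'umpire', 'worker']
```

Against a live stack, fetch_catalog() would replace the hard-coded body.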
This is the build output out.txt
I attached the build output above
Sorry, I'm kind of at a loss. Nothing in that build output suggests that it's having trouble building or pushing, but your docker images and curl output show that all of the images are missing except app_sdk. Was the behavior the same (docker service ps showing missing images) after that latest build?
Is there anything that might be cleaning up images on your system? If you manually build something and then push it to WALKOFF's registry while it's running, does it show up in docker images and the registry catalog? For example:
# Inside the WALKOFF directory
docker build -f api_gateway/Dockerfile -t 127.0.0.1:5000/api_gateway .
docker push 127.0.0.1:5000/api_gateway
.\walkoff.ps1 down --debug
.\walkoff.ps1 up --debug
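The manual workaround above could be scripted roughly as follows. This is a minimal sketch, not what the bootloader does: build_and_push_commands and run_all are hypothetical helpers, and the directory-to-image mapping is an assumption based on the api_gateway example (each service directory holding its own Dockerfile, tagged under the 127.0.0.1:5000 registry).

```python
import subprocess


def build_and_push_commands(service_dir, image, registry="127.0.0.1:5000"):
    """Return the docker build and push commands for one service image."""
    tag = registry + "/" + image
    return [
        ["docker", "build", "-f", service_dir + "/Dockerfile", "-t", tag, "."],
        ["docker", "push", tag],
    ]


def run_all(services, cwd="."):
    """Build and push every image; stops at the first command that fails."""
    for service_dir, image in services.items():
        for cmd in build_and_push_commands(service_dir, image):
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(cmd, cwd=cwd, check=True)


# Print the commands instead of running them, to review them first.
for cmd in build_and_push_commands("api_gateway", "api_gateway"):
    print(" ".join(cmd))
```

Calling run_all({"api_gateway": "api_gateway"}) from inside the WALKOFF directory would then execute the same two commands, with the registry service already running.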
So doing that I now have the nginx and web interface
So it's working now. I'm not sure if I should close the issue or not, but my gut tells me there's a deeper issue here.
So I just did a clean install and I had to do the manual build and push again. Shouldn't this be automated?
I just moved my development environment out of an Ubuntu VM and into Windows. Even on a clean install it still builds and pushes as expected in the walkoff.ps1 script, so I'm not really sure what the issue might be on your end. We're wrapping up some efforts for a release over the next few days; I'll get back to this once we release.
Alright, I'm going to look in to this on my end. A hack might be needed, if I can fix it on my end, I'll send a pull request.
Also, I have a theory that #245 might be related to this as well. If something is screwed up with the way it works on my machine (which is weird because docker usually works, but for some reason it stopped working right around the time the bootloader got implemented), it might be causing some kind of issue with how the push is handled. I don't know what would cause it though, as I'm running the latest docker for windows in a relatively vanilla windows 10.
Wait a second, just tested development branch and it seems to work out of the box.
I'm going to close this issue for now. #245 seems to be having similar issues but for everything but bootloader and the api gateway. I have no idea why.
Wait, I think I was wrong.
docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
walkoff_bootloader latest 7660c1f0526e About a minute ago 427MB
Yeah, I was wrong, it's still broken. I need to tear in to this.
Out of curiosity, what version of Windows are you running, and is it on the latest patch set? Also, are you running the latest Docker for Windows? I'm doing all of that, and I've run it from both my D drive and my desktop.
I use this to set up my desktop when I create a new one. I do add a few things after that (mainly games), but this is the bulk of it. https://github.com/aoighost/newwin10installscript
So on development I ran up, and after the script did its thing, this is the output of docker stack services walkoff:
17j07lxx4ytn walkoff_resource_redis replicated 1/1 redis:latest *:6379->6379/tcp
54g24pk0gbnn walkoff_app_nmap replicated 0/0 127.0.0.1:5000/walkoff_app_nmap:1.0.0
5jfoqi1cw0cv walkoff_app_sdk replicated 0/0 127.0.0.1:5000/walkoff_app_sdk:latest
5pflldvhq6a7 walkoff_resource_registry replicated 1/1 registry:2 *:5000->5000/tcp
7oivhiuyoesc walkoff_app_basics replicated 0/0 127.0.0.1:5000/walkoff_app_basics:1.0.0
9g8vux7la7e4 walkoff_app_ip_addr_utils replicated 0/0 127.0.0.1:5000/walkoff_app_ip_addr_utils:1.0.0
cigjqalgf9a3 walkoff_resource_portainer replicated 1/1 portainer/portainer:latest
crewnpyik1ye walkoff_app_ssh replicated 0/0 127.0.0.1:5000/walkoff_app_ssh:1.0.0
jc68iuqbz3sl walkoff_app_power_shell replicated 0/0 127.0.0.1:5000/walkoff_app_power_shell:1.0.0
m50dffhmdpjl walkoff_app_adversary_hunting replicated 0/0 127.0.0.1:5000/walkoff_app_adversary_hunting:1.0.0
q11jaz1pkia6 walkoff_core_umpire replicated 0/1 127.0.0.1:5000/umpire:latest
qkx0kyiau1b8 walkoff_core_api_gateway replicated 0/1 127.0.0.1:5000/api_gateway:latest
umlfwm87jlda walkoff_resource_postgres replicated 1/1 postgres:latest
vixnw1v2zur2 walkoff_resource_minio replicated 1/1 minio/minio:latest
xipvr7vdel5q walkoff_core_worker replicated 0/0 127.0.0.1:5000/worker:latest
xv9ny5e05qpe walkoff_resource_nginx replicated 0/1 bitnami/nginx:latest *:8080->8080/tcp
ykdrhvg76sly walkoff_app_walk_off replicated 0/0 127.0.0.1:5000/walkoff_app_walk_off:1.0.0
yqjyqj9pc23e walkoff_app_hive replicated 0/0 127.0.0.1:5000/walkoff_app_hive:1.0.0
yx7ypavpt111 walkoff_app_mitre_attack replicated 0/0 127.0.0.1:5000/walkoff_app_mitre_attack:1.0.0
So on latest master and dev I've been able to replicate this and umpire is doing the same thing as gateway.
I've done this on two Windows 10 boxes, a desktop and a laptop. On the laptop I can at least work around it by manually pushing, but on the desktop manually pushing gives me an internal server error.
Are you still experiencing the same issue? To follow up from a few months ago, it was tested on three machines, one on 1809, one on the latest build of Win10 at the time, and another on the fast ring insider build of Win10 20H1 at the time. All had the latest version of Docker for Windows at the time.
Not trying to deflect but it still feels like the issue is more likely to be in configuration of Docker or network settings.
My debug steps would be:
docker stack services walkoff for failing services
docker service ps service_name --no-trunc for the exit codes for a failing service
docker service logs -f service_name for the logs on a failing service
Can only really get an idea of what's going on with all three of those from a particular failed deployment.
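The first of these steps can be automated with a small parser. A minimal sketch, assuming the default table layout of docker stack services (ID, NAME, MODE, REPLICAS, IMAGE, PORTS); under_replicated and stack_services are hypothetical helper names. Services at 0/0 are treated as intentionally scaled down rather than failing.

```python
import re
import subprocess


def under_replicated(services_output):
    """Return (name, running, desired) for services with running < desired."""
    failing = []
    for line in services_output.splitlines():
        fields = line.split()
        # Need at least ID, NAME, MODE and REPLICAS; skips blank lines.
        if len(fields) < 4:
            continue
        # The header row has "REPLICAS" here, which does not match n/m.
        match = re.fullmatch(r"(\d+)/(\d+)", fields[3])
        if not match:
            continue
        running, desired = int(match.group(1)), int(match.group(2))
        if running < desired:
            failing.append((fields[1], running, desired))
    return failing


def stack_services(stack="walkoff"):
    """Capture the plain-text output of `docker stack services <stack>`."""
    result = subprocess.run(
        ["docker", "stack", "services", stack],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Rows copied from the listings posted in this thread.
sample = """ID NAME MODE REPLICAS IMAGE PORTS
8hwek6bhek9m walkoff_resource_registry replicated 1/1 registry:2 *:5000->5000/tcp
aeijazmi07bc walkoff_resource_nginx replicated 0/1 bitnami/nginx:latest *:8080->8080/tcp
2tlevhxrh2gy walkoff_app_nmap replicated 0/0 127.0.0.1:5000/walkoff_app_nmap:1.0.0
olaf1aiuub54 walkoff_core_umpire replicated 0/1 127.0.0.1:5000/umpire:latest
"""
for name, running, desired in under_replicated(sample):
    print(name, str(running) + "/" + str(desired))
```

Each name it reports is a candidate for the docker service ps and docker service logs steps; on a live system, under_replicated(stack_services()) would replace the sample.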
So I was able to get it to work on Linux, however it appears the latest builds are not properly building using the ps1 script in git. It actually errors out when trying to run it on my end. First it refuses to create walkoff network, then it refuses to build other images and stand them up. It seems that the core images are the ones not loading up. I can test again and post logs, I'll open another issue for it.
I have no idea what's causing it. It's even happening on a fresh windows 10 install for me with nothing but the stuff in the install script here: https://github.com/aoighost/newwin10installscript
using windows terminal and powershell core from the script
I know you're not trying to deflect; it seems like it works for everyone but me. If I can find out what's causing it, it would make my life a lot easier. For now though I'm exploring Kubernetes and Argo as an alternative to WALKOFF for my project. I can't keep putting off dev for this.
Hi @aoighost, I'm running WALKOFF on CentOS and I just ran into the same issue as you. There's no way I can explain it and it drove me crazy for days! Clean install over and over again and I just couldn't get it to work. Exact same configuration somewhere else and it works.
Thanks to all of your investigation, I fixed it by following your steps:
docker build -f api/Dockerfile -t 127.0.0.1:5000/walkoff_core_api .
docker push 127.0.0.1:5000/walkoff_core_api
docker build -f umpire/Dockerfile -t 127.0.0.1:5000/walkoff_core_umpire .
docker push 127.0.0.1:5000/walkoff_core_umpire
docker build -f socketio/Dockerfile -t 127.0.0.1:5000/walkoff_core_socketio .
docker push 127.0.0.1:5000/walkoff_core_socketio
./walkoff.sh up --build
I cannot explain it either. It was working fine the day before, the next day I couldn't anymore. Well, if anybody knows someday.. anyway, thanks to all of you!
I faced a similar issue on Debian 9.
0p4ydflf0zsr walkoff_core_socketio replicated 0/1 127.0.0.1:5000/walkoff_core_socketio:latest *:3000->3000/tcp
4j2m9mncenvv walkoff_core_umpire replicated 0/1 127.0.0.1:5000/walkoff_core_umpire:latest
6wg942wn5pq1 walkoff_core_api replicated 0/1 127.0.0.1:5000/walkoff_core_api:latest
lrxe885j5fln walkoff_resource_nginx replicated 0/1 bitnami/nginx:latest *:8080->8080/tcp
Nginx was failing on
2020/08/27 09:11:59 [emerg] 1#0: host not found in upstream "core_api" in /opt/bitnami/nginx/conf/server_blocks/walkoff.conf:10
nginx: [emerg] host not found in upstream "core_api" in /opt/bitnami/nginx/conf/server_blocks/walkoff.conf:10
I ran
docker build -f umpire/Dockerfile -t 127.0.0.1:5000/walkoff_core_umpire .
docker build -f api/Dockerfile -t 127.0.0.1:5000/walkoff_core_api .
docker build -f socketio/Dockerfile -t 127.0.0.1:5000/walkoff_core_socketio .
./walkoff.sh up
core_api was then failing because of another bug that is not directly related: https://github.com/nsacyber/WALKOFF/pull/272
Rebuilding the images helped 🎉 Thanks!
So I've moved to Shuffle which supports walkoff apps and is actively developed, in case anyone is looking for help with this
@aoighost see https://github.com/nsacyber/WALKOFF/pull/273
This fixed the problem for me.
On the latest master, Nginx isn't starting on Windows.