microsoft / vision-ai-developer-kit

Vision AI Developer Kit Preview
MIT License

[bug] Installing firmware v0.5280_Perf does not startup with deployment.template.json modules file / Restricted password characters #267

Closed V4A001 closed 4 years ago

V4A001 commented 4 years ago
  1. Latest firmware v0.5280_Perf
  2. Connecting to Wi-Fi works the first time, but then the IoT hub does not seem to respond properly, so I reconnect. However, the device then goes silent: connecting to the IoT hub (green LEDs blinking 3x) does not work, and setup hangs on 'Initial page ...'


V4A001 commented 4 years ago

I held reset for at least 12 seconds, then used another Wi-Fi network, and it seems to run now. Let me try the other device. Now waiting until the 3 simultaneous blinks turn into 3 continuously lit LEDs.

V4A001 commented 4 years ago

Waiting for minutes now. Seems that the containers are not started. docker ps does not show anything.

/ # iotedge check
Configuration checks

√ config.yaml is well-formed
× config.yaml has well-formed connection string
    Invalid connection string format detected. Please check the value of the provisioning.device_connection_string parameter.
√ container engine is installed and functional
× config.yaml has correct hostname
    config.yaml has hostname localhost but device reports hostname qcs605-32
× config.yaml has correct URIs for daemon mgmt endpoint
    Unable to find image 'mcr.microsoft.com/azureiotedge-diagnostics:1.0.7' locally
    1.0.7: Pulling from azureiotedge-diagnostics
    9d34ec1d9f3e: Pulling fs layer
    8c5f1a821a85: Pulling fs layer
    8c5f1a821a85: Verifying Checksum
    8c5f1a821a85: Download complete
    9d34ec1d9f3e: Verifying Checksum
    9d34ec1d9f3e: Download complete
    9d34ec1d9f3e: Pull complete
    8c5f1a821a85: Pull complete
    Digest: sha256:72b13af52ea605cd9eb628b476f4ec353ef8a17fd94b475ca4086d40bfa103ad
    Status: Downloaded newer image for mcr.microsoft.com/azureiotedge-diagnostics:1.0.7
    Error: could not execute list-modules request: an error occurred trying to connect: Connection refused (os error 111)
‼ latest security daemon
    Installed IoT Edge daemon has version 1.0.7 but version 1.0.8 is available. Please see https://aka.ms/iotedge-update-runtime for update instructions.
√ host time is close to real time
√ container time is close to host time
‼ DNS server
    Container engine is not configured with DNS server setting, which may impact connectivity to IoT Hub. Please see https://aka.ms/iotedge-prod-checklist-dns for best practices.
    You can ignore this warning if you are setting DNS server per module in the Edge deployment.
‼ production readiness: certificates
    Device is using self-signed, automatically generated certs. Please see https://aka.ms/iotedge-prod-checklist-certs for best practices.
× production readiness: certificates expiry
    Could not enumerate files under /var/lib/iotedge/hsm/certs
√ production readiness: container engine
√ production readiness: logs policy

Connectivity checks

× Edge Hub can bind to ports on host
    Could not check current state of Edge Hub container

One or more checks raised errors. Re-run with --verbose for more details.
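The "Invalid connection string format" failure above can be checked by hand before writing the value to the device. The following is a minimal sketch, assuming the `HostName=...;DeviceId=...;SharedAccessKey=...` layout that appears in the setting.conf dump later in this thread. The function name and the sample values are hypothetical, and this is not how `iotedge check` validates the string internally.

```python
# Hypothetical helper: sanity-check a device connection string offline.
# Assumes the "Key=Value;Key=Value" layout seen in this thread.
REQUIRED_KEYS = ("HostName", "DeviceId", "SharedAccessKey")


def parse_connection_string(value: str) -> dict:
    """Split the string into a dict; raise ValueError if it is malformed."""
    fields = {}
    for part in value.strip().split(";"):
        if not part:
            continue  # tolerate a trailing semicolon
        # Split on the first "=" only: base64 keys may end with "=".
        key, sep, val = part.partition("=")
        if not sep or not key or not val:
            raise ValueError(f"malformed segment: {part!r}")
        fields[key] = val
    missing = [k for k in REQUIRED_KEYS if k not in fields]
    if missing:
        raise ValueError("missing keys: " + ", ".join(missing))
    return fields


if __name__ == "__main__":
    # Placeholder values, not real credentials.
    sample = "HostName=example.azure-devices.net;DeviceId=dev1;SharedAccessKey=abc123="
    print(sorted(parse_connection_string(sample)))
```

A string that was cut off during copy and paste, or that lost its trailing `=` padding, would fail here in the same spirit as the check above.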

V4A001 commented 4 years ago

Message: "This device has an empty configuration for the edge agent. Please set a deployment manifest." I added DNS 1.1.1.1 and then Google's DNS 8.8.8.8. Now it seems only the agent runs; all docker images are there, but they do not start. Next error:

417 - The device's deployment configuration is not set

Running 'docker ps' over adb shows that only the agent is running and IoT Edge is not fully started, because no deployment is there. I believed that was done through Visual Studio Code.

I resolved it by using this from another post:

The edge hub module is started when there is a deployment. The deployment may be empty. (For example, if you go to the IoT Edge device on Azure portal, select "SetModules" and just hit "next" until you get to the "submit" button, this will create a deployment with no other modules, but edgeAgent and edgeHub will be created, and you will not get the "417" error status.)
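For reference, an "empty" deployment of the kind described above can be sketched as a manifest that lists only the two system modules. This is a rough illustration, assuming the IoT Edge 1.0 deployment manifest layout as I understand it; the field names should be checked against the official documentation, and the function name is hypothetical. The image tags are the ones that appear elsewhere in this thread.

```python
import json


def build_empty_deployment(agent_image: str, hub_image: str) -> dict:
    """Return a manifest with edgeAgent and edgeHub only (no custom modules).

    Assumed layout: IoT Edge 1.0 deployment manifest. Verify against the docs.
    """
    edge_agent = {
        "schemaVersion": "1.0",
        "runtime": {
            "type": "docker",
            "settings": {
                "minDockerVersion": "v1.25",
                "loggingOptions": "",
                "registryCredentials": {},
            },
        },
        "systemModules": {
            "edgeAgent": {
                "type": "docker",
                "settings": {"image": agent_image, "createOptions": "{}"},
            },
            "edgeHub": {
                "type": "docker",
                "status": "running",
                "restartPolicy": "always",
                "settings": {"image": hub_image, "createOptions": "{}"},
            },
        },
        "modules": {},  # empty on purpose: no user modules
    }
    edge_hub = {
        "schemaVersion": "1.0",
        "routes": {},
        "storeAndForwardConfiguration": {"timeToLiveSecs": 7200},
    }
    return {
        "modulesContent": {
            "$edgeAgent": {"properties.desired": edge_agent},
            "$edgeHub": {"properties.desired": edge_hub},
        }
    }


if __name__ == "__main__":
    manifest = build_empty_deployment(
        "mcr.microsoft.com/azureiotedge-agent:1.0",
        "mcr.microsoft.com/azureiotedge-hub:1.0",
    )
    print(json.dumps(manifest, indent=2))
```

Clicking through "SetModules" in the portal without adding anything should produce something equivalent, which is why the 417 status goes away.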

V4A001 commented 4 years ago

Is there a way to push this DNS configuration by default to the device or is there a reason it is not there by default?
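On the per-module route that the `iotedge check` DNS warning mentions: the DNS setting lives in each module's `createOptions`, which is a JSON document stored as an escaped string inside the deployment manifest. The following is a small sketch of generating that string instead of escaping it by hand; the helper name is hypothetical, and it only covers the `Dns` and `PortBindings` keys used in this thread.

```python
import json


def create_options(dns_servers, tcp_ports=None) -> str:
    """Build the createOptions string for a module.

    dns_servers: list of DNS server IPs, e.g. ["1.1.1.1"]
    tcp_ports:   optional list of TCP ports to bind to the same host port
    """
    host_config = {"Dns": list(dns_servers)}
    if tcp_ports:
        host_config["PortBindings"] = {
            f"{port}/tcp": [{"HostPort": str(port)}] for port in tcp_ports
        }
    # Compact separators keep the string short once it is embedded in the manifest.
    return json.dumps({"HostConfig": host_config}, separators=(",", ":"))


if __name__ == "__main__":
    # edgeAgent: DNS only. edgeHub: DNS plus the three ports used in this thread.
    print(create_options(["1.1.1.1"]))
    print(create_options(["1.1.1.1"], tcp_ports=[5671, 8883, 443]))
```

When the surrounding manifest is serialized with `json.dumps`, the inner quotes are escaped automatically, which avoids hand-typed `\"` mistakes.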

V4A001 commented 4 years ago

After deploying new docker containers, the device loses its connectivity again... Firmware upgrade for the fifth time? The docker containers run fine when I check with docker images and docker ps.

/ # ping 1.1.1.1
connect: Network is unreachable

My DNS settings:

"systemModules": {
  "edgeAgent": {
    "type": "docker",
    "settings": {
      "image": "mcr.microsoft.com/azureiotedge-agent:1.0.8",
      "createOptions": "{\"HostConfig\":{\"Dns\":[\"1.1.1.1\"]}}"
    }
  },
  "edgeHub": {
    "type": "docker",
    "status": "running",
    "restartPolicy": "always",
    "settings": {
      "image": "mcr.microsoft.com/azureiotedge-hub:1.0",
      "createOptions": "{\"HostConfig\":{\"Dns\":[\"1.1.1.1\"],\"PortBindings\":{\"5671/tcp\":[{\"HostPort\":\"5671\"}],\"8883/tcp\":[{\"HostPort\":\"8883\"}],\"443/tcp\":[{\"HostPort\":\"443\"}]}}}"
    }

PuneetRahejaMS commented 4 years ago

@V4A001 Can you try a factory/hard reset and try again?

V4A001 commented 4 years ago

I have deployed firmware v0.4940_Perf back on the device instead of v0.5280_Perf and have the same connectivity issues. After the firmware upgrade I can ping, but after a reboot I no longer can. This happens both with the default docker images loaded by the firmware's manifest and with the sample built from my own deployment manifest.

===
/ # docker pull mcr.microsoft.com/aivision/visionsamplemodule:1.1.3-arm32v7
Error response from daemon: Get https://mcr.microsoft.com/v2/: tls: failed to parse certificate from server: asn1: syntax error: invalid boolean
/ # ping 1.1.1.1
connect: Network is unreachable
/ #

jkubicka commented 4 years ago

@V4A001 Wi-Fi connectivity is a known issue with v0.4940. Please update the firmware to v0.5280 and complete setup.

Then if you're still having Wi-Fi issues, try the command adb shell connectap < Wi-Fi network name > none -1

Are you on a Corp Wi-Fi? Could be a firewall issue.

Devinwong commented 4 years ago

@V4A001, it seems that you were not able to connect to your Wi-Fi automatically after reboot. If you went through the OOBE setup webpages, the profile should have been saved in the file /data/misc/websettings/setting.conf. You can take a look at this file and see if the SSID is correct. Note that the passphrase is encrypted. You can also try what Jan mentioned above with the connectap command.

V4A001 commented 4 years ago

/ # cat etc/version
v0.5280_Perf
/ # cat /data/misc/websettings/setting.conf
{"connect_ap":{"ssid":"CISCO-GUESTS","password":"xx"},"username":"device_admin","password":"$xx","io_t_connect_string":"HostName=xx;DeviceId=xx;SharedAccessKey=xx","ssh_enable":true,"user_dns":{"enabled":true,"value":"8.8.8.8"}}/ #
/ # connectap cisco-guests none -1
[Connect AP] SSID=cisco-guests, key=none, Security=-1
wlan.ko exists!
kill wpa_supplicant!
kill udhcpc!
[Scan AP]
Successfully initialized wpa_supplicant
eap_proxy:eap_proxy_get_imsi: Not initialized

OK
tempbuff is less than 20 character
/ # ping 1.1.1.1
PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
From 192.168.10.135 icmp_seq=1 Destination Host Unreachable
^C
--- 1.1.1.1 ping statistics ---
2 packets transmitted, 0 received, +1 errors, 100% packet loss, time 1001ms

/ # reboot
/ #
(base) C:\WINDOWS\system32>adb shell
/ # ping 1.1.1.1
connect: Network is unreachable
/ # cat /data/misc/websettings/setting.conf
{"connect_ap":{"ssid":"CISCO-GUESTS","password":"xx"},"username":"device_admin","password":"xx","io_t_connect_string":"HostName=xx;DeviceId=xx;SharedAccessKey=xx","ssh_enable":true,"user_dns":{"enabled":true,"value":"8.8.8.8"}}/ #
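One detail in the dump above is easy to miss: the saved SSID is `CISCO-GUESTS`, while the `connectap` command was typed with `cisco-guests`. SSIDs are case-sensitive, so those are two different networks as far as the scan is concerned. A minimal sketch of checking this from a copy of setting.conf, assuming the JSON layout shown above (the helper names are hypothetical):

```python
import json


def saved_ssid(setting_conf_text: str) -> str:
    """Return the SSID stored under connect_ap.ssid in setting.conf."""
    return json.loads(setting_conf_text)["connect_ap"]["ssid"]


def ssid_matches(setting_conf_text: str, typed_ssid: str) -> bool:
    """Exact, case-sensitive comparison, since SSIDs are case-sensitive."""
    return saved_ssid(setting_conf_text) == typed_ssid


# Trimmed copy of the structure shown in this thread, with redacted values.
SAMPLE = '{"connect_ap":{"ssid":"CISCO-GUESTS","password":"xx"},"username":"device_admin"}'

if __name__ == "__main__":
    for candidate in ("CISCO-GUESTS", "cisco-guests"):
        print(candidate, ssid_matches(SAMPLE, candidate))
```

This only rules out a case mismatch; it says nothing about whether the stored, encrypted passphrase is read back correctly after a reboot.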

V4A001 commented 4 years ago

/ # ping 1.1.1.1
connect: Network is unreachable
/ # docker ps
CONTAINER ID  IMAGE  COMMAND  CREATED  STATUS  PORTS  NAMES
2ca11f47c8ed  mcr.microsoft.com/azureiotedge-hub:1.0.8.3  "/bin/sh -c 'echo \"$…"  11 minutes ago  Up 34 seconds  0.0.0.0:443->443/tcp, 0.0.0.0:5671->5671/tcp, 0.0.0.0:8883->8883/tcp  edgeHub
363673996bf6  mymodule.azurecr.io/aivisiondevkitgetstartedmodule:0.0.3-arm32v7  "python3 -u ./main.py"  11 minutes ago  Up 37 seconds  AIVisionDevKitGetStartedModule
47fa807f5db2  mymodule.azurecr.io/webstreammodule:0.0.21-arm32v7  "docker-entrypoint.s…"  14 minutes ago  Up 33 seconds  WebStreamModule
78ca252c2af3  mcr.microsoft.com/azureiotedge-agent:1.0.8.3  "/bin/sh -c 'echo \"$…"  15 minutes ago  Up 52 seconds  edgeAgent
/ # shell connectap cisco-guests none -1
/bin/sh: shell: not found
/ # connectap cisco-guests none -1
[Connect AP] SSID=cisco-guests, key=none, Security=-1
wlan.ko exists!
kill wpa_supplicant!
killall: wifimonitor: no process killed
[Scan AP]
Successfully initialized wpa_supplicant
eap_proxy:eap_proxy_get_imsi: Not initialized

OK
tempbuff is less than 20 character
/ # ping 1.1.1.1
connect: Network is unreachable
/ #

V4A001 commented 4 years ago

So I do not understand what could be wrong. Is my Wi-Fi password maybe too long? But why does it say it connects, and then ping is no longer possible, and therefore neither is downloading a new docker image?

/ # docker pull mcr.microsoft.com/aivision/visionsamplemodule:1.1.3-arm32v7
Error response from daemon: Get https://mcr.microsoft.com/v2/: tls: failed to parse certificate from server: asn1: syntax error: invalid boolean
/ # iotedge check
Configuration checks

√ config.yaml is well-formed
√ config.yaml has well-formed connection string
√ container engine is installed and functional
√ config.yaml has correct hostname
× config.yaml has correct URIs for daemon mgmt endpoint
    Unable to find image 'mcr.microsoft.com/azureiotedge-diagnostics:1.0.7' locally
    docker: Error response from daemon: Get https://mcr.microsoft.com/v2/: tls: failed to parse certificate from server: asn1: syntax error: invalid boolean.
    See 'docker run --help'.
‼ latest security daemon
    Error while fetching latest versions of edge components: could not send HTTP request

V4A001 commented 4 years ago

I can connect to Wi-Fi now through my mobile phone, and I tried again with my AP. With the SSID typed in the correct upper/lower case in the command, it works. But after a reboot it still does not. Is there a command to save this setup?

After reboot again no connectivity.

Could it be that I have strange characters in my Wi-Fi settings, such as a - and a password with a %? My password is also long.

Error response from daemon: Get https://mcr.microsoft.com/v2/: tls: failed to parse certificate from server: asn1: syntax error: invalid boolean
/ # connectap cisco-guests <difficult password with %5 in it> -1
[Connect AP] SSID=cisco-guests, key=<difficult password with %5 in it>, Security=-1
wlan.ko exists!
killall: wifimonitor: no process killed
[Scan AP]
Successfully initialized wpa_supplicant
eap_proxy:eap_proxy_get_imsi: Not initialized

OK
tempbuff is less than 20 character
/ # ping 1.1.1.1
connect: Network is unreachable
/ # connectap MyMobile -1
[Connect AP] SSID=MyMobile, key=, Security=-1
wlan.ko exists!
killall: wifimonitor: no process killed
[Scan AP]
Successfully initialized wpa_supplicant
eap_proxy:eap_proxy_get_imsi: Not initialized

apssid MyMobile

Find AP:0a:c5:e1:bd:06:04 2437 -26 [WPA2-PSK-CCMP][ESS] MyMobile

OK
[Connectap] AP Proto is WPA2 CCMP
kill wpa_supplicant!
Successfully initialized wpa_supplicant
eap_proxy:eap_proxy_get_imsi: Not initialized

udhcpc (v1.24.1) started
Setting IP address 0.0.0.0 on wlan0
Sending discover...
Sending discover...
Sending select for 192.168.43.123...
Lease of 192.168.43.123 obtained, lease time 3600
Setting IP address 192.168.43.123 on wlan0
Deleting routers
route: SIOCDELRT: No such process
Adding router 192.168.43.1
Recreating /var/run/resolv.conf
Adding DNS server 192.168.43.1
connect to MyMobile successfully!
find softap0
/ # ping 1.1.1.1
PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
64 bytes from 1.1.1.1: icmp_seq=1 ttl=55 time=142 ms
64 bytes from 1.1.1.1: icmp_seq=2 ttl=55 time=34.0 ms
^C
--- 1.1.1.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 34.011/88.452/142.893/54.441 ms
/ # connectap CISCO-GUESTS <difficult password with %5 in it> -1
[Connect AP] SSID=CISCO-GUESTS, key=<difficult password with %5 in it>, Security=-1
wlan.ko exists!
kill wpa_supplicant!
kill udhcpc!
[Scan AP]
Successfully initialized wpa_supplicant
eap_proxy:eap_proxy_get_imsi: Not initialized

apssid CISCO-GUESTS

Find AP:a0:55:4f:68:a7:f2 5180 -34 [WPA2-PSK-CCMP][ESS] CISCO-GUESTS

OK
[Connectap] AP Proto is WPA2 CCMP
kill wpa_supplicant!
Successfully initialized wpa_supplicant
eap_proxy:eap_proxy_get_imsi: Not initialized

udhcpc (v1.24.1) started
Setting IP address 0.0.0.0 on wlan0
Sending discover...
Sending discover...
Sending select for 192.168.10.135...
Lease of 192.168.10.135 obtained, lease time 172800
Setting IP address 192.168.10.135 on wlan0
Deleting routers
route: SIOCDELRT: No such process
Adding router 192.168.10.1
Recreating /var/run/resolv.conf
Adding DNS server 8.8.8.8
Adding DNS server 8.8.4.4
connect to CISCO-GUESTS successfully!
find softap0
/ # PING 1.1.1.1
/bin/sh: PING: not found
/ # ping 1.1.1.1
PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
64 bytes from 1.1.1.1: icmp_seq=1 ttl=59 time=6.27 ms
64 bytes from 1.1.1.1: icmp_seq=2 ttl=59 time=5.88 ms
^C
--- 1.1.1.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 5.883/6.078/6.274/0.210 ms
/ #

V4A001 commented 4 years ago

I think there is a bug in the saving and retrieval of SSIDs with more complex names and passwords. Can I see the source code somewhere? I saw wlanstart... When no connection is present, it finally ends up running with something like this. The initial device ID to connect to is also no longer there.

/ # cat /data/misc/wifi/hostapd_virtual.conf
ctrl_interface=/var/run/hostapd
interface=softap0

driver=nl80211

ieee80211d=1

ieee80211n=1
hw_mode=g
country_code=US
ssid=ap99999
macaddr_acl=0
channel=0
wpa=2
wpa_passphrase=12345678
wpa_key_mgmt=WPA-PSK

wpa_pairwise=CCMP

rsn_pairwise=TKIP CCMP
ht_capab=HT20 SHORT-GI-20
wmm_enabled=1
ignore_broadcast_ssid=0
/ #

V4A001 commented 4 years ago

After 4 days of debugging I believe it is a bug. I connected my device to a hotspot repeater on 2.4 GHz with a simpler SSID and password: simplessid as the SSID and 1234567890 as the password. It runs now. I will now change this new access point to the difficult password. It is not 5 GHz. Changing my home/office network is much more complicated...

cat etc/version

v0.5280_Perf
/ # ping 1.1.1.1
PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
64 bytes from 1.1.1.1: icmp_seq=1 ttl=59 time=7.07 ms
^C
--- 1.1.1.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 7.073/7.073/7.073/0.000 ms
/ # cat /data/misc/wifi/hostapd_virtual.conf
ctrl_interface=/var/run/hostapd
interface=softap0

driver=nl80211

ieee80211d=1

ieee80211n=1
hw_mode=g
country_code=US
ssid=MSIoT_E1ED23
macaddr_acl=0
channel=0
wpa=2
wpa_passphrase=uyO314H3
wpa_key_mgmt=WPA-PSK

wpa_pairwise=CCMP

rsn_pairwise=TKIP CCMP
ht_capab=HT20 SHORT-GI-20
wmm_enabled=1
ignore_broadcast_ssid=0
/ # cat /data/misc/websettings/setting.conf
{"connect_ap":{"ssid":"simplessid","password":"xx"},"username":"device_admin","password":"xx","io_t_connect_string":"","ssh_enable":true,"user_dns":{"enabled":true,"value":"8.8.8.8"}}/ #
/ #
--- 1.1.1.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 14.813/20.178/25.543/5.365 ms
/ # reboot

adb ^C
/ # adb shell
/bin/sh: adb: not found
/ # ping 1.1.1.
^C
/ # ping 1.1.1.1
PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
From 192.168.0.101 icmp_seq=1 Destination Host Unreachable
From 192.168.0.101 icmp_seq=2 Destination Host Unreachable
From 192.168.0.101 icmp_seq=3 Destination Host Unreachable
^C
--- 1.1.1.1 ping statistics ---
6 packets transmitted, 0 received, +3 errors, 100% packet loss, time 5173ms
pipe 4
/ #

V4A001 commented 4 years ago

I confirm that it is a bug. On the previously working device I now use the SSID dlink-A23B with a long password with a % in it: about 16 characters, mixed upper and lower case, randomly generated by LastPass.

cat etc/version

v0.5280_Perf
/ # ping 1.1.1.1
PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
64 bytes from 1.1.1.1: icmp_seq=1 ttl=59 time=1309 ms
^C
--- 1.1.1.1 ping statistics ---
2 packets transmitted, 1 received, 50% packet loss, time 1032ms
rtt min/avg/max/mdev = 1309.994/1309.994/1309.994/0.000 ms, pipe 2
/ # docker ps
CONTAINER ID  IMAGE  COMMAND  CREATED  STATUS  PORTS  NAMES
/ # docker images
REPOSITORY  TAG  IMAGE ID  CREATED  SIZE
mcr.microsoft.com/aivision/visionsamplemodule  1.1.3-arm32v7  f8a7667eb3ea  5 months ago  659MB
mcr.microsoft.com/aivision/visionsamplemodule  webstream_0.0.13-arm32v7  f5a04d2c1e25  5 months ago  256MB
mcr.microsoft.com/azureiotedge-hub  1.0  3ec47d87bfe6  6 months ago  244MB
mcr.microsoft.com/azureiotedge-agent  1.0.7.1  c885e9f568a8  6 months ago  229MB
/ # docker images
REPOSITORY  TAG  IMAGE ID  CREATED  SIZE
mcr.microsoft.com/azureiotedge-agent  1.0  7654bbe73d3c  2 days ago  205MB
mcr.microsoft.com/aivision/visionsamplemodule  1.1.3-arm32v7  f8a7667eb3ea  5 months ago  659MB
mcr.microsoft.com/aivision/visionsamplemodule  webstream_0.0.13-arm32v7  f5a04d2c1e25  5 months ago  256MB
mcr.microsoft.com/azureiotedge-hub  1.0  3ec47d87bfe6  6 months ago  244MB
mcr.microsoft.com/azureiotedge-agent  1.0.7.1  c885e9f568a8  6 months ago  229MB
/ # docker images
REPOSITORY  TAG  IMAGE ID  CREATED  SIZE
mcr.microsoft.com/azureiotedge-agent  1.0  7654bbe73d3c  2 days ago  205MB
mcr.microsoft.com/aivision/visionsamplemodule  1.1.3-arm32v7  f8a7667eb3ea  5 months ago  659MB
mcr.microsoft.com/aivision/visionsamplemodule  webstream_0.0.13-arm32v7  f5a04d2c1e25  5 months ago  256MB
mcr.microsoft.com/azureiotedge-hub  1.0  3ec47d87bfe6  6 months ago  244MB
mcr.microsoft.com/azureiotedge-agent  1.0.7.1  c885e9f568a8  6 months ago  229MB
/ # docker images
REPOSITORY  TAG  IMAGE ID  CREATED  SIZE
mcr.microsoft.com/azureiotedge-agent  1.0  7654bbe73d3c  2 days ago  205MB
mcr.microsoft.com/aivision/visionsamplemodule  1.1.3-arm32v7  f8a7667eb3ea  5 months ago  659MB
mcr.microsoft.com/aivision/visionsamplemodule  webstream_0.0.13-arm32v7  f5a04d2c1e25  5 months ago  256MB
mcr.microsoft.com/azureiotedge-hub  1.0  3ec47d87bfe6  6 months ago  244MB
mcr.microsoft.com/azureiotedge-agent  1.0.7.1  c885e9f568a8  6 months ago  229MB
/ # ping 1.1.1.1
PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
From 192.168.0.101 icmp_seq=3 Destination Host Unreachable
From 192.168.0.101 icmp_seq=4 Destination Host Unreachable
From 192.168.0.101 icmp_seq=5 Destination Host Unreachable
^C
--- 1.1.1.1 ping statistics ---
6 packets transmitted, 0 received, +3 errors, 100% packet loss, time 5177ms
pipe 4
/ # reboot
/ # ping 1.1.1.1
connect: Network is unreachable
/ #

Devinwong commented 4 years ago

Probably the % in the pwd is not handled correctly. I will try to repro this from my side. Thanks for your finding.
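The thread does not say how the firmware mishandles `%`. One common way a `%` gets mangled, offered here purely as an illustration and not as the confirmed cause, is a value passing through URL (percent) decoding once more than it was encoded, for example on its way through a web setup form. The sketch below uses Python's standard `urllib.parse`; the helper name and sample passwords are hypothetical.

```python
from urllib.parse import quote, unquote


def survives_percent_decoding(password: str) -> bool:
    """True if an extra URL-decoding pass would leave the password unchanged."""
    return unquote(password) == password


if __name__ == "__main__":
    # "%5B" is a valid escape for "[", so an extra decode silently changes it.
    for pw in ("abc%5Bdef", "abc123", "100%"):
        print(repr(pw), "->", repr(unquote(pw)), survives_percent_decoding(pw))

    # Encoding exactly once before sending and decoding exactly once on
    # receipt round-trips any password, including ones that contain "%".
    original = "abc%5Bdef"
    assert unquote(quote(original, safe="")) == original
```

If the stored passphrase differs from the typed one in this way, the scan would still find the AP but the association would fail, which would be worth comparing against the logs above.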

V4A001 commented 4 years ago

My access point works now. I created a new SSID and replaced the % with a P, but still had the issue (I believe, but maybe I pinged too fast; I need to wait until the green LEDs blink). Then I shortened my password to 9 characters. The length recommended by ngov is about 64 characters, or even longer without two-factor authentication.

Note: I left the SSH out as well; there were also % characters and tokens there.
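For context on length and characters: a WPA2-PSK passphrase is defined as 8 to 63 printable ASCII characters (or, alternatively, a 64-digit hexadecimal key), so a 16-character password containing `%` is valid as far as the standard is concerned. The check below is a small sketch of that rule only; the function name is hypothetical, and whatever extra limits the devkit firmware applies are not documented in this thread.

```python
import string


def is_valid_wpa2_passphrase(passphrase: str) -> bool:
    """Check the WPA2-PSK rule: 8..63 printable ASCII chars, or 64 hex digits."""
    if len(passphrase) == 64:
        # A 64-character value is treated as a raw hex key, not a passphrase.
        return all(c in string.hexdigits for c in passphrase)
    printable = all(32 <= ord(c) <= 126 for c in passphrase)
    return 8 <= len(passphrase) <= 63 and printable


if __name__ == "__main__":
    # Sample values only; "1234567890" is the simple password used in this thread.
    for pw in ("1234567890", "short", "Xy%5kQ9rLm2Pz8Wd"):
        print(repr(pw), is_valid_wpa2_passphrase(pw))
```

A password that passes this check but fails on the device points at the device's own handling rather than at the access point.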

v-prsasa commented 4 years ago

@V4A001 Any update on this?

Devinwong commented 4 years ago

The issue is fixed. We are validating the new firmware and will release it once that is done, probably within 2 weeks.

PuneetRahejaMS commented 4 years ago

Closing since the fix will be available in the next firmware release.

sadranyi commented 2 years ago

Has this been resolved? I am still having the same issues. Can someone point me to some materials and resources?