migmeneses closed this issue 10 months ago
Listing
The Share ID we are looking for is 06633a54-8ec6-457b-be4e-a60b103ebb7a.
It is the one with which we could reproduce the issue.
openstack share instance list -c Status -c ID -c "Share ID" -c "Share Server ID"
+--------------------------------------+--------------------------------------+-----------------+--------------------------------------+
| ID | Share ID | Status | Share Server ID |
+--------------------------------------+--------------------------------------+-----------------+--------------------------------------+
| 0be6585c-1c8d-4ec6-b7de-401af2e139f8 | 06633a54-8ec6-457b-be4e-a60b103ebb7a | error | 7b861199-4fae-4f9a-9ade-96a4083cf2c0 |
| 0d4ad385-9bec-42b0-b98f-27d8714ecb2c | 1e1d41e5-d2b3-42ca-926b-eea9f15de2f2 | extending_error | 1c4e6020-a99e-4e49-ad91-c08b0403738a |
| 81d0fd88-5e4b-44ab-b447-52c3f614404d | 232f1603-089f-43ea-8b47-606599394d21 | error | 0dd5aa13-485c-4931-abbb-0e74d1fdf731 |
| f58ed3a2-d5c9-4cd2-8080-2eed0bc1db27 | 35b4cdb9-8ae6-41df-98fc-1ed5b65f67e5 | error | 18a691fd-3b1d-45f0-9d4d-5c49304324b5 |
| 5cfaf4a6-3460-4991-9eb4-39db6a9c04bd | 5976c3b0-cf3a-4ea0-82c4-81b40ec4cdc1 | available | 7b861199-4fae-4f9a-9ade-96a4083cf2c0 |
| b13e43c8-ee67-44bd-b125-d728932464cc | 5e192a0c-7b4f-4c0f-93f2-591d9b2f2cf7 | extending_error | 54436e7a-2d25-4bd5-be36-9e3049cfad2c |
| 61f541fa-0070-4d99-a0b9-bdd29bb01e3c | 82ffd590-76d1-457e-be3c-29d1591eb2b1 | error | 1c4e6020-a99e-4e49-ad91-c08b0403738a |
| db69ede0-baae-4528-961a-1b8ac638e885 | 8381dee1-0ed2-4cb9-a164-42475883f7d2 | available | 7b861199-4fae-4f9a-9ade-96a4083cf2c0 |
| 4bf0a51f-21c9-4cd4-a8e3-624c9602407b | 97aafc2d-1a32-4193-b5a6-afc7b2707725 | extending_error | 864152f0-8e54-408f-b32c-b90734d5145c |
| ce4b2764-80c2-48b2-8c55-40c705643f2e | 9ddc9370-3d74-443a-9f9c-bc0657e4d837 | error | 7b861199-4fae-4f9a-9ade-96a4083cf2c0 |
| 34c536db-5b1a-483c-b349-ebcb6b6fe43c | c00351a8-ac1d-4c49-bb12-57b10c062ddd | extending_error | 864152f0-8e54-408f-b32c-b90734d5145c |
| 000ec30f-c6a9-40f7-ab2c-bd568eff649a | cd8fff13-3112-4166-b6d1-e6c0e81aff1f | available | 864152f0-8e54-408f-b32c-b90734d5145c |
| 6781e150-5d6f-47fb-988d-6ca99acd09f5 | d5bbab56-aed3-4a9c-8815-3258e34e4e26 | extending_error | 1c4e6020-a99e-4e49-ad91-c08b0403738a |
| 249c04fd-7134-4911-b8fb-b585933738b8 | e08bf7b3-ae61-47db-ab0b-d80b548b174a | available | 864152f0-8e54-408f-b32c-b90734d5145c |
| 82074794-0128-4070-8384-c2dca95ba7d6 | e6eae36c-a36b-4dad-b04c-584def02da04 | available | 54436e7a-2d25-4bd5-be36-9e3049cfad2c |
+--------------------------------------+--------------------------------------+-----------------+--------------------------------------+
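To see which shares sit on the same share server as the failing one, the table can be grouped by "Share Server ID". The following is only a rough sketch: `parse_table` and `statuses_by_server` are hypothetical helper names, and the embedded text is a three-row excerpt of the listing above. In practice it is probably simpler to ask the CLI for machine-readable output with `-f json` instead of parsing the ASCII table.

```python
from collections import defaultdict

# Three rows copied from the listing above, enough to show the idea.
TABLE = """\
+----+----------+--------+-----------------+
| ID | Share ID | Status | Share Server ID |
+----+----------+--------+-----------------+
| 0be6585c-1c8d-4ec6-b7de-401af2e139f8 | 06633a54-8ec6-457b-be4e-a60b103ebb7a | error | 7b861199-4fae-4f9a-9ade-96a4083cf2c0 |
| 5cfaf4a6-3460-4991-9eb4-39db6a9c04bd | 5976c3b0-cf3a-4ea0-82c4-81b40ec4cdc1 | available | 7b861199-4fae-4f9a-9ade-96a4083cf2c0 |
| 0d4ad385-9bec-42b0-b98f-27d8714ecb2c | 1e1d41e5-d2b3-42ca-926b-eea9f15de2f2 | extending_error | 1c4e6020-a99e-4e49-ad91-c08b0403738a |
+----+----------+--------+-----------------+
"""


def parse_table(text):
    """Turn an openstack-style ASCII table into a list of row dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the +----+ border lines and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first data line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def statuses_by_server(rows):
    """Group (share ID, status) pairs by the share server hosting them."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["Share Server ID"]].append((row["Share ID"], row["Status"]))
    return dict(grouped)


if __name__ == "__main__":
    for server, shares in statuses_by_server(parse_table(TABLE)).items():
        print(server)
        for share_id, status in shares:
            print("   ", share_id, status)
```

Grouped this way, the excerpt shows that share server 7b861199-4fae-4f9a-9ade-96a4083cf2c0 hosts both the failing share and a share that is `available`.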
openstack server list --all-project | grep 7b861199-4fae-4f9a-9ade-96a4083cf2c0
| e7dc4bc3-46ec-47bf-b256-2eaedffe478e | generic_7b861199-4fae-4f9a-9ade-96a4083cf2c0 | ACTIVE | manila_service_network=10.254.0.37; sr-default=10.10.1.114 | manila-service-image | m1.manila |
The instance is up and running (ACTIVE), and I was able to SSH into it.
Ping it and check that the SSH port is listening:
root@ctl1:~# ip netns exec qdhcp-6aac9402-b603-4a55-98e3-0d683374f2b8 ping -c 3 10.254.0.37
PING 10.254.0.37 (10.254.0.37) 56(84) bytes of data.
64 bytes from 10.254.0.37: icmp_seq=1 ttl=64 time=2.62 ms
64 bytes from 10.254.0.37: icmp_seq=2 ttl=64 time=0.755 ms
64 bytes from 10.254.0.37: icmp_seq=3 ttl=64 time=0.691 ms
--- 10.254.0.37 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2029ms
rtt min/avg/max/mdev = 0.691/1.356/2.624/0.896 ms
root@ctl1:~# ip netns exec qdhcp-6aac9402-b603-4a55-98e3-0d683374f2b8 nc -vz 10.254.0.37 22
Connection to 10.254.0.37 22 port [tcp/ssh] succeeded!
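If the reachability check needs to be repeated (for example in a loop while a share is being created), the `nc -vz` probe can be approximated in Python. This is a minimal sketch, not something taken from Manila: `tcp_port_open` is a hypothetical helper, and the script would still have to be started inside the same network namespace (for example via `ip netns exec <namespace> python3 ...`) to reach the service network.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


# Example call, using the address from the output above:
#   tcp_port_open("10.254.0.37", 22)
```

A single successful probe only shows the port is reachable at that moment; it says nothing about intermittent timeouts.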
Attached volumes inside the instance:
manila@ubuntu:~$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 956M 0 956M 0% /dev
tmpfs 198M 2.1M 196M 2% /run
/dev/vda1 2.7G 1.6G 952M 63% /
tmpfs 986M 0 986M 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 986M 0 986M 0% /sys/fs/cgroup
/dev/vdb 98G 20G 73G 22% /shares/share-db69ede0-baae-4528-961a-1b8ac638e885
/dev/vdc 20G 24K 19G 1% /shares/share-5cfaf4a6-3460-4991-9eb4-39db6a9c04bd
tmpfs 198M 0 198M 0% /run/user/1000
manila@ubuntu:~$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 252:0 0 20G 0 disk
└─vda1 252:1 0 2.9G 0 part /
vdb 252:16 0 100G 0 disk /shares/share-db69ede0-baae-4528-961a-1b8ac638e885
vdc 252:32 0 40G 0 disk /shares/share-5cfaf4a6-3460-4991-9eb4-39db6a9c04bd
logs:
root@ubuntu:/var/log/samba# cat log.nmbd
[2023/08/18 08:05:59.722908, 0] ../../source3/nmbd/nmbd.c:901(main)
nmbd version 4.15.13-Ubuntu started.
Copyright Andrew Tridgell and the Samba Team 1992-2021
[2023/08/18 08:05:59.724571, 0] ../../lib/util/become_daemon.c:150(daemon_status)
daemon_status: daemon 'nmbd' : No local IPv4 non-loopback interfaces available, waiting for interface ...
[2023/08/18 08:05:59.724586, 0] ../../source3/nmbd/nmbd_subnetdb.c:252(create_subnets)
NOTE: NetBIOS name resolution is not supported for Internet Protocol Version 6 (IPv6).
[2023/08/18 08:06:49.315253, 0] ../../source3/nmbd/nmbd_become_lmb.c:398(become_local_master_stage2)
*****
Samba name server UBUNTU is now a local master browser for workgroup WORKGROUP on subnet 10.254.0.37
*****
[2023/08/18 08:12:06.647508, 0] ../../source3/nmbd/nmbd_become_lmb.c:398(become_local_master_stage2)
*****
Samba name server UBUNTU is now a local master browser for workgroup WORKGROUP on subnet 10.10.1.114
@migmeneses after testing, I think this behavior is most likely triggered by a temporary network timeout or block. My guess is that https://github.com/vexxhost/atmosphere/issues/547 suffered from the same problem.
Here are a few tasks we can take on from here:
Let me check on the share type one.
Regarding the network, can you also help me get the output of netstat -s, arp -a, and the full journalctl, or any other data that looked odd to you? Thanks. You can share those as files via direct message so we don't overload this page :)
we were looking way too far :)
+-------------------------+------------------------------------------------------------------------------------------------+
| Field | Value |
+-------------------------+------------------------------------------------------------------------------------------------+
| admin_state_up | UP |
| allowed_address_pairs | |
| binding_host_id | ctl1 |
| binding_profile | |
| binding_vif_details | |
| binding_vif_type | binding_failed |
| binding_vnic_type | normal |
| created_at | 2023-08-25T18:20:21Z |
| data_plane_status | None |
| description | |
| device_id | manila-share |
| device_owner | manila:share |
| device_profile | None |
| dns_assignment | fqdn='host-10-254-0-10.openstacklocal.', hostname='host-10-254-0-10', ip_address='10.254.0.10' |
| | fqdn='host-10-254-0-28.openstacklocal.', hostname='host-10-254-0-28', ip_address='10.254.0.28' |
| | fqdn='host-10-254-0-40.openstacklocal.', hostname='host-10-254-0-40', ip_address='10.254.0.40' |
| | fqdn='host-10-254-0-59.openstacklocal.', hostname='host-10-254-0-59', ip_address='10.254.0.59' |
| | fqdn='host-10-254-0-78.openstacklocal.', hostname='host-10-254-0-78', ip_address='10.254.0.78' |
| dns_domain | |
| dns_name | |
| extra_dhcp_opts | |
| fixed_ips | ip_address='10.254.0.10', subnet_id='299d2e30-fd9d-40ca-aa0a-eb4bc57bf4ef' |
| | ip_address='10.254.0.28', subnet_id='742b3d58-4e21-4dc1-9591-5419695197c4' |
| | ip_address='10.254.0.40', subnet_id='493e4d1a-0d64-4dde-85de-cfa852f6d45f' |
| | ip_address='10.254.0.59', subnet_id='fb2f3007-6c3d-4300-a403-ccc9e9e40bd7' |
| | ip_address='10.254.0.78', subnet_id='d1bfda43-4209-45db-a52c-1e53ab669dba' |
| id | 2425d9ef-af85-4fbc-aaeb-b852c4dde1c5 |
| ip_allocation | immediate |
| mac_address | fa:16:3e:6b:ba:75 |
| name | |
| network_id | 6aac9402-b603-4a55-98e3-0d683374f2b8 |
| numa_affinity_policy | None |
| port_security_enabled | False |
| project_id | e6f5006edea941169a9771e748887742 |
| propagate_uplink_status | None |
| qos_network_policy_id | None |
| qos_policy_id | None |
| resource_request | None |
| revision_number | 6 |
| security_group_ids | |
| status | DOWN |
| tags | |
| trunk_details | None |
| updated_at | 2023-09-13T10:39:51Z |
+-------------------------+------------------------------------------------------------------------------------------------+
The port's binding_host_id is not the FQDN, so the port is failing to bind. We need to figure out where Manila sets that value and pass the FQDN through properly.
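The three fields that give this away in the port output are `binding_host_id`, `binding_vif_type` and `status`. As a rough illustration, a check over those fields could look like the sketch below. `binding_problems` is a hypothetical helper, the "contains a dot" test is only a crude heuristic for "looks like an FQDN", and the healthy example values (`ctl1.example.com`, `ovs`) are made up for contrast.

```python
# Values copied from the port shown above.
PORT = {
    "binding_host_id": "ctl1",
    "binding_vif_type": "binding_failed",
    "status": "DOWN",
}


def binding_problems(port):
    """Return a list of human-readable findings for a Neutron port dict."""
    problems = []
    host = port.get("binding_host_id") or ""
    if port.get("binding_vif_type") == "binding_failed":
        problems.append("binding failed on host %r" % host)
    if "." not in host:
        # Heuristic only: a short name has no domain part.
        problems.append("binding_host_id %r does not look like an FQDN" % host)
    if port.get("status") != "ACTIVE":
        problems.append("port status is %s" % port.get("status"))
    return problems


if __name__ == "__main__":
    for finding in binding_problems(PORT):
        print(finding)
```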
Stinky.
Manila uses socket.gethostname() with no way of configuring this.
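To see the difference on a given node, the two standard-library calls can be compared directly. This is just a diagnostic sketch (`host_names` is a hypothetical helper); what each call returns depends entirely on the node's hostname and resolver configuration, so no particular output is implied.

```python
import socket


def host_names():
    """Return the short hostname and the resolver's idea of the FQDN."""
    return {
        # What Manila uses, according to the comment above.
        "gethostname": socket.gethostname(),
        # May differ from the above when the hostname is a short name.
        "getfqdn": socket.getfqdn(),
    }


if __name__ == "__main__":
    for call, value in host_names().items():
        print("socket.%s() -> %s" % (call, value))
```

If the two values differ, a port bound with the short name will carry a `binding_host_id` that does not match the FQDN.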
Waiting for CI on https://review.opendev.org/c/openstack/manila/+/896692
Once this lands, we will need to set CONF.host to the FQDN when manila-share starts up.
The Manila chart needs to set [DEFAULT]/host to the FQDN of the system. You can see how this is done in other services like Nova:
https://github.com/vexxhost/atmosphere/blob/ea7e98410b095244a11940f542f82b3fd9a8675c/charts/nova/templates/bin/_nova-compute-init.sh.tpl#L63-L69
https://github.com/vexxhost/atmosphere/blob/ea7e98410b095244a11940f542f82b3fd9a8675c/charts/nova/templates/bin/_nova-compute.sh.tpl#L23-L25
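The idea is to generate a small override file at startup that pins `[DEFAULT]/host` to the node's FQDN and to load it alongside the main configuration. The linked Nova templates do this in shell; the Python sketch below only illustrates the shape of such an override and is not the chart's actual implementation. `manila_host_override` is a hypothetical helper, and `ctl1.example.com` is a made-up FQDN.

```python
import configparser
import io
import socket


def manila_host_override(fqdn=None):
    """Render an INI fragment that sets [DEFAULT]/host to the given FQDN."""
    parser = configparser.ConfigParser()
    # Fall back to the resolver's FQDN when none is passed in.
    parser["DEFAULT"] = {"host": fqdn or socket.getfqdn()}
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    # "ctl1.example.com" is illustrative only.
    print(manila_host_override("ctl1.example.com"))
```

Where the fragment is written and how manila-share is pointed at it is left open here, since that depends on how the chart wires up its config files.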
We will also need to follow up on the upstream change and make sure it is backported to stable/zed:
https://review.opendev.org/q/I4181a6f1527c80bf356d6363300b2d420921e7fa
requested backports
Tracking the Manila work to allow configuring the port host with separate configs: https://review.opendev.org/c/openstack/manila/+/897077
we might be able to backport https://review.opendev.org/c/openstack/manila/+/897077 after all (see comments inside)
I am going to mark this issue as solved, as the bug fix has been released in the new version of Manila.
Regards
We still need this issue to track https://review.opendev.org/c/openstack/manila/+/897077 and https://github.com/vexxhost/atmosphere/pull/668
all solved now
Via the dashboard, I created a share with a size of 10 GB. The creation process takes a long time (around 5-10 minutes), never finishes, and ends in an error state.
Steps:
creating a new share:
It looks like it is going to re-use an existing instance
It ends in an error state:
It looks like the error happens when Manila tries to re-use an existing instance.