stormshift / support

This repo should serve as a central source for reporting issues with stormshift
GNU General Public License v3.0

Configure / Setup NetApp #87

Closed. rbo closed this issue 2 years ago.

rbo commented 2 years ago

Stopped all Storage VMs.

(screenshot attached)

DanielFroehlich commented 2 years ago

To move stormshift from the internal NFS servers on storm3 and storm6 to NetApp NFS, we would need around 4TB of NFS storage, and a means to create NFS exports/volumes via Ansible for the OCP dynamic NFS provisioners.
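For reference, a minimal sketch of the kind of Ansible-driven provisioning meant here, using the netapp.ontap collection's volume module ad hoc; the cluster hostname, credentials, SVM and aggregate names below are placeholders, not the real StormShift values:

```bash
# requires: ansible-galaxy collection install netapp.ontap
# all hostnames, credentials and names below are placeholders / assumptions
ansible localhost -m netapp.ontap.na_ontap_volume \
  -a "state=present name=ocp1_dyn_nfs vserver=stormshift_svm aggregate_name=aggr1 size=500 size_unit=gb junction_path=/ocp1_dyn_nfs hostname=netapp.example.com username=admin password=changeme https=true validate_certs=false"
```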

rbo commented 2 years ago

@DanielFroehlich Currently in use: 850G on storm3 and 8.1G on storm6, so isn't 4 TB rather "pessimistic"?

Details from storm3 & storm6:

### Storm3

```
[root@storm3 storage]# df -h .
Filesystem                 Size  Used Avail Use% Mounted on
/dev/mapper/sataraid-rhev  3.0T  850G  2.2T  28% /var/rhev
[root@storm3 storage]# cat /etc/exports
/var/rhev/storage/nfs/data *(async,wdelay,rw)
/var/rhev/storage/nfs/export *(async,wdelay,rw)
/var/rhev/storage/nfs/iso *(async,wdelay,rw)
/var/rhev/storage/nfs/ocp1-reg *(async,wdelay,rw)
/var/rhev/storage/nfs/ocp1-dyn-nfs *(async,wdelay,rw)
/var/rhev/storage/nfs/ocp2-dyn-nfs *(async,wdelay,rw)
/var/rhev/storage/nfs/ocp2-stat-nfs *(async,wdelay,rw,no_root_squash)
/var/rhev/storage/nfs/ocp5-reg *(async,wdelay,rw)
/var/rhev/storage/nfs/ocp5-dyn-nfs *(async,wdelay,rw,no_root_squash)
/var/rhev/storage/nfs/rbohne-dyn-nfs *(async,wdelay,rw,no_root_squash)
/var/rhev/storage/nfs/rhacm-dyn-nfs *(async,wdelay,rw)
/var/rhev/storage/nfs/mmgt-dyn-nfs *(async,wdelay,rw)
/var/rhev/storage/nfs/ocp6-dyn-nfs *(async,wdelay,rw)
/var/rhev/storage/nfs/ocp7-dyn-nfs *(async,wdelay,rw)
[root@storm3 storage]#
```

### Storm6

```
[root@storm6 ~]# cat /etc/exports
/var/rhev/storage/nfs/engine *(async,wdelay,rw)
/var/rhev/storage/nfs/data *(async,wdelay,rw)
/var/rhev/storage/nfs/export *(async,wdelay,rw)
/var/rhev/storage/nfs/iso *(async,wdelay,rw)
[root@storm6 ~]# df -h /var/rhev/
Filesystem                    Size  Used Avail Use% Mounted on
/dev/mapper/rhel_storm6-root   50G  8.1G   42G  17% /
[root@storm6 ~]#
```

DanielFroehlich commented 2 years ago

storm6 is using 2.3T and is 92% full. 2.3T (storm6) + 0.8T (storm3) = 3.1T currently in use, so 4TB is more realistic, I would say.

[root@storm6 ~]# df -h
Filesystem                                              Size  Used Avail Use% Mounted on
/dev/mapper/sataraid-rhev_data                          2.5T  2.3T  219G  92% /var/rhev/storage/nfs/data
rbo commented 2 years ago

Ah my fault:

[root@storm6 ~]# df -h | grep /var/rhev | grep ^/dev
/dev/mapper/sataraid-rhev_iso                            50G  4.4G   46G   9% /var/rhev/storage/nfs/iso
/dev/mapper/sataraid-rhev_export                         50G  2.4G   48G   5% /var/rhev/storage/nfs/export
/dev/mapper/sataraid-rhev_engine                        100G   21G   80G  21% /var/rhev/storage/nfs/engine
/dev/mapper/sataraid-rhev_data                          2.5T  2.3T  216G  92% /var/rhev/storage/nfs/data
[root@storm6 ~]#

Does that include all virtual machine disk images?

DanielFroehlich commented 2 years ago

All VM images that are on the RHV storage domain "data6" - there are also some on storm3. At some point storm3 became too small, hence I added a second domain, plus some OCP IPI RHV storage provisioning volumes on storm6. If we could consolidate all of this onto the NetApp that would be great, and it would probably increase our availability because the NetApp is much more stable than storm3 (which currently hangs). So 4TB of NetApp storage would be the minimum I'd say; if we could get 5TB we would have more headroom.

rbo commented 2 years ago

Got it, thanks.

rbo commented 2 years ago

All data on the NetApp has been deleted; no volumes, Storage VMs, ... are available anymore.

rbo commented 2 years ago

On the Arista switch:

localhost>show interfaces status 
Port      Name              Status       Vlan        Duplex  Speed Type        
...
Et41      NetApp1           connected    9             full  10000 10GBASE-CR  
Et42      NetApp2           connected    9             full  10000 10GBASE-CR  
...
localhost>conf
localhost>enable
localhost#conf t
localhost(config)#int e41
localhost(config-if-Et41)#show interface status
localhost(config-if-Et41)#int e42              
localhost(config-if-Et42)#switchport mode trunk

localhost>show interfaces status 
Port      Name              Status       Vlan        Duplex  Speed Type        
..
Et41      NetApp1           connected    trunk         full  10000 10GBASE-CR  
Et42      NetApp2           connected    trunk         full  10000 10GBASE-CR  
...
localhost(config-if-Et42)#write memory 
localhost(config-if-Et42)#
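
For reference, a quick reachability check from one of the storm hosts once the ports are trunked might look like this (the 10.32.97.x address is the NFS LIF mentioned below; showmount comes from nfs-utils):

```bash
# verify the NetApp NFS LIF is reachable and list which exports it offers
ping -c 3 10.32.97.2
showmount -e 10.32.97.2
```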
rbo commented 2 years ago

@DanielFroehlich NFS for StormShift: 10.32.97.2:/stormshift_rhv - 1TB for now, it grows automatically, and de-duplication is enabled.
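
A quick manual sanity check before wiring this into RHV could look like the following; the mount point and test file name are just examples:

```bash
# test-mount the new export, verify it is writable, then clean up
mkdir -p /mnt/stormshift_rhv
mount -t nfs 10.32.97.2:/stormshift_rhv /mnt/stormshift_rhv
touch /mnt/stormshift_rhv/write-test && rm /mnt/stormshift_rhv/write-test
umount /mnt/stormshift_rhv
```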

DanielFroehlich commented 2 years ago

Cool, Thx!

Two points:

Permission issue: "Error while executing the action Add Storage Connection: Permission settings on the specified path do not allow access to the storage. Please verify the permission settings on the specified storage path."

Engine mount: Can I get a second path "stormshift_rhv_engine" of 100G for the RHV hosted engine?

Thx Daniel

rbo commented 2 years ago

done: 10.32.97.1:/stormshift_rhv_engine

The permission issue is strange, I tested the mount and created data. Maybe we have to change some unix permissions. Is the share mounted somewhere?

DanielFroehlich commented 2 years ago

vdsm log on storm3 (where the mount was tested):

2022-07-05 09:22:36,126+0200 INFO  (jsonrpc/6) [vdsm.api] START connectStorageServer(domType=1, spUUID='00000000-0000-0000-0000-000000000000', conList=[{'password': '********', 'protocol_version': 'auto', 'port': '', 'iqn': '', 'connection': '10.32.97.2:/stormshift_rhv', 'ipv6_enabled': 'false', 'id': '00000000-0000-0000-0000-000000000000', 'user': '', 'tpgt': '1'}]) from=::ffff:10.32.105.40,35720, flow_id=6d384ae3-9a7c-4191-882c-147966ed5c26, task_id=04f05fb3-1bdd-4336-bedd-40a5da64454b (api:48)
2022-07-05 09:22:36,127+0200 INFO  (jsonrpc/6) [storage.storageServer] Creating directory '/rhev/data-center/mnt/10.32.97.2:_stormshift__rhv' (storageServer:234)
2022-07-05 09:22:36,127+0200 INFO  (jsonrpc/6) [storage.fileutils] Creating directory: /rhev/data-center/mnt/10.32.97.2:_stormshift__rhv mode: None (fileUtils:231)
2022-07-05 09:22:36,128+0200 INFO  (jsonrpc/6) [storage.mount] mounting 10.32.97.2:/stormshift_rhv at /rhev/data-center/mnt/10.32.97.2:_stormshift__rhv (mount:207)
2022-07-05 09:22:36,272+0200 INFO  (jsonrpc/6) [IOProcessClient] (Global) Starting client (__init__:340)
2022-07-05 09:22:36,284+0200 INFO  (ioprocess/2504967) [IOProcess] (Global) Starting ioprocess (__init__:465)
2022-07-05 09:22:36,296+0200 WARN  (jsonrpc/6) [storage.oop] Permission denied for directory: /rhev/data-center/mnt/10.32.97.2:_stormshift__rhv with permissions:7 (outOfProcess:193)
2022-07-05 09:22:36,296+0200 INFO  (jsonrpc/6) [storage.mount] unmounting /rhev/data-center/mnt/10.32.97.2:_stormshift__rhv (mount:215)
2022-07-05 09:22:36,346+0200 ERROR (jsonrpc/6) [storage.storageServer] Could not connect to storage server (storageServer:92)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/storage/fileSD.py", line 82, in validateDirAccess
    getProcPool().fileUtils.validateAccess(dirPath)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/outOfProcess.py", line 194, in validateAccess
    raise OSError(errno.EACCES, os.strerror(errno.EACCES))
PermissionError: [Errno 13] Permission denied

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/storage/storageServer.py", line 90, in connect_all
    con.connect()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/storageServer.py", line 524, in connect
    return self._mountCon.connect()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/storageServer.py", line 258, in connect
    six.reraise(t, v, tb)
  File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise
    raise value
  File "/usr/lib/python3.6/site-packages/vdsm/storage/storageServer.py", line 251, in connect
    self.getMountObj().getRecord().fs_file)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/fileSD.py", line 93, in validateDirAccess
    raise se.StorageServerAccessPermissionError(dirPath)
vdsm.storage.exception.StorageServerAccessPermissionError: Permission settings on the specified path do not allow access to the storage. Verify permission settings on the specified storage path.: 'path = /rhev/data-center/mnt/10.32.97.2:_stormshift__rhv'
2022-07-05 09:22:36,347+0200 INFO  (jsonrpc/6) [storage.storagedomaincache] Invalidating storage domain cache (sdc:74)
2022-07-05 09:22:36,347+0200 INFO  (jsonrpc/6) [vdsm.api] FINISH connectStorageServer return={'statuslist': [{'id': '00000000-0000-0000-0000-000000000000', 'status': 469}]} from=::ffff:10.32.105.40,35720, flow_id=6d384ae3-9a7c-4191-882c-147966ed5c26, task_id=04f05fb3-1bdd-4336-bedd-40a5da64454b (api:54)
2022-07-05 09:22:36,683+0200 INFO  (jsonrpc/1) [vdsm.api] START repoStats(domains=['24aaade7-ee5a-49a4-8220-5dbb4f71f8aa']) from=::1,44446, task_id=5c03fbb0-e8f5-4f6a-b6da-0e1cba80eb44 (api:48)
2022-07-05 09:22:36,683+0200 INFO  (jsonrpc/1) [vdsm.api] FINISH repoStats return={'24aaade7-ee5a-49a4-8220-5dbb4f71f8aa': {'code': 0, 'lastCheck': '2.8', 'delay': '0.000278952', 'valid': True, 'version': 5, 'acquired': True, 'actual': True}} from=::1,44446, task_id=5c03fbb0-e8f5-4f6a-b6da-0e1cba80eb44 (api:54)
2022-07-05 09:22:39,203+0200 INFO  (jsonrpc/4) [api.host] START getStats() from=::ffff:10.32.105.40,35720 (api:48)
2022-07-05 09:22:39,238+0200 INFO  (jsonrpc/4) [vdsm.api] START repoStats(domains=()) from=::ffff:10.32.105.40,35720, task_id=91eb9057-6bc9-40ef-982b-32752945d2a8 (api:48)
2022-07-05 09:22:39,238+0200 INFO  (jsonrpc/4) [vdsm.api] FINISH repoStats return={'24aaade7-ee5a-49a4-8220-5dbb4f71f8aa': {'code': 0, 'lastCheck': '0.5', 'delay': '0.000394695', 'valid': True, 'version': 5, 'acquired': True, 'actual': True}, '329372c9-086b-42e7-8e4c-7b6a36ff0491': {'code': 0, 'lastCheck': '5.8', 'delay': '0.000179837', 'valid': True, 'version': 0, 'acquired': True, 'actual': True}, '4665b9ab-a632-4436-817d-2de8f1c7e363': {'code': 0, 'lastCheck': '5.8', 'delay': '0.000342414', 'valid': True, 'version': 5, 'acquired': True, 'actual': True}, 'b7faa86a-9209-474b-9e70-f8acabbbe23e': {'code': 0, 'lastCheck': '5.8', 'delay': '0.000234192', 'valid': True, 'version': 0, 'acquired': True, 'actual': True}, '2fb5374d-0ec9-407e-bdf3-6ceac99a4480': {'code': 0, 'lastCheck': '7.9', 'delay': '0.000227582', 'valid': True, 'version': 5, 'acquired': True, 'actual': True}} from=::ffff:10.32.105.40,35720, task_id=91eb9057-6bc9-40ef-982b-32752945d2a8 (api:54)
DanielFroehlich commented 2 years ago

With 777 permissions on the export, the mount from RHV does work!
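
For the record, the usual RHV/oVirt requirement is that the export root is owned by vdsm:kvm (UID/GID 36:36) rather than being opened up to 777; a minimal sketch from any host that has the share mounted, assuming root is not squashed on the export and the mount point is just an example:

```bash
# set the standard vdsm:kvm (36:36) ownership via a temporary client mount
mount -t nfs 10.32.97.2:/stormshift_rhv /mnt/stormshift_rhv
chown 36:36 /mnt/stormshift_rhv
chmod 0755 /mnt/stormshift_rhv
umount /mnt/stormshift_rhv
```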

rbo commented 2 years ago

Trident is configured at our BareMetal COE Cluster https://console-openshift-console.apps.cluster.coe.muc.redhat.com/

How to configure trident on your OpenShift Cluster is documented here: https://github.com/stormshift/iac/blob/main/docs/netapp-trident.md

oc get storageclass| grep ^coe
coe-netapp-nas                                 csi.trident.netapp.io                   Delete          Immediate              false                  6m13s
coe-netapp-san                                 csi.trident.netapp.io                   Delete          Immediate     

NAS = NFS
SAN = iSCSI
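
As a usage example, a claim against the NAS (NFS) class could be created like this; the PVC name, namespace and size are placeholders:

```bash
# hypothetical PVC using the coe-netapp-nas class; adjust name/size/namespace as needed
oc apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-claim
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: coe-netapp-nas
EOF
```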

rbo commented 2 years ago

@DanielFroehlich can you change your NFS mounts from 10.32.97.2:/stormshift_rhv to 10.32.97.1:/stormshift_rhv - thanks & sorry (the one instead of the two at the end of the IP).

DanielFroehlich commented 2 years ago

Stormshift RHV domain switched to 10.32.97.1.

DanielFroehlich commented 2 years ago

Strange storage usage: the RHV disk is 497G, but 1.3T is reported as used???

[root@storm3 0c2d5fff-fe37-4bf0-8cd4-9eb916e2a88f]# du -hs .
497G    .
[root@storm3 0c2d5fff-fe37-4bf0-8cd4-9eb916e2a88f]# df -h .
Filesystem                  Size  Used Avail Use% Mounted on
10.32.97.2:/stormshift_rhv  1.4T  1.3T   36G  98% /root/dan
[root@storm3 0c2d5fff-fe37-4bf0-8cd4-9eb916e2a88f]# 

Maybe some kind of snapshots are active?

rbo commented 2 years ago

Yes, snapshots are active. I can disable them for rhv if you like.

DanielFroehlich commented 2 years ago

Yep, I see them now:

du -hs .*
4.8T    .
15G     ..
[root@storm3 dan]# ls -la
total 16
drwxrwxrwx.  3 root root 4096 Jul  6 11:03 .
dr-xr-x---. 13 root root 4096 Jul  6 10:56 ..
drwxrwxrwx. 10 root root 4096 Jul  6 11:05 .snapshot
drwxr-xr-x.  4 vdsm kvm  4096 Jul  6 10:16 0c2d5fff-fe37-4bf0-8cd4-9eb916e2a88f

Already 4.2T of snapshots. We don't need that, please disable. Thx Daniel

rbo commented 2 years ago

done and snapshots deleted.
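
For anyone repeating this later: on the ONTAP side this presumably boils down to removing the snapshot policy and reserve and deleting the existing snapshots; a rough sketch over SSH, where the cluster address, SVM and volume names are assumptions:

```bash
# assumes ONTAP CLI access on the cluster management LIF; all names are placeholders
ssh admin@netapp-cluster "volume modify -vserver stormshift_svm -volume stormshift_rhv -snapshot-policy none -percent-snapshot-space 0"
ssh admin@netapp-cluster "volume snapshot delete -vserver stormshift_svm -volume stormshift_rhv -snapshot *"
```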

rbo commented 2 years ago

FYI: don't trust du too much on a NetApp NFS share ;-)
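
The reason the numbers don't add up, assuming the usual ONTAP behaviour: blocks held by snapshots are charged to the volume, so df sees them, but a du of the active tree never visits them, and a du of .snapshot overcounts because snapshots share blocks. A rough way to compare the two views from an NFS client, using the paths from the example above:

```bash
# live data as seen by du vs. total volume usage (including snapshots) as seen by df
du -hs /root/dan/0c2d5fff-fe37-4bf0-8cd4-9eb916e2a88f
df -h /root/dan
```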