quantum / esos

An open source, high performance, block-level storage platform.
http://www.esos-project.com/

Help need #73

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Hello,
I followed the guide at http://marcitland.blogspot.it, but unfortunately it does
not work, or at least the DRBD and CLVM part of the cluster does not.
I have since tried other solutions, but without success.

Can you help me?

I am using r626.

crm(live)configure# primitive p-clvm ocf:lvm2:clvmd \
>        params daemon_timeout="30" \
>        op monitor interval="60" timeout="30" \
>        op start interval="0" timeout="90" \
>        op stop interval="0" timeout="100"
ERROR: ocf:lvm2:clvmd: could not parse meta-data:
ERROR: ocf:lvm2:clvmd: no such resource agent

And
------------------------------------------
crm(live)# configure
crm(live)configure# template
crm(live)configure template# list

Original issue reported on code.google.com by dotcom...@gmail.com on 10 Apr 2014 at 6:10

GoogleCodeExporter commented 9 years ago
I'll take a look later this evening after the new image has posted.

--Marc

Original comment by msmith...@gmail.com on 11 Apr 2014 at 1:16

GoogleCodeExporter commented 9 years ago
Hi,

Sorry for the delay. I took a look, and there is no resource agent "lvm2:clvmd" 
in ESOS (/usr/lib/ocf/resource.d/lvm2).

In the article I wrote, I don't use a resource agent for clvmd; it's started in 
ESOS init/rc (rc.clvmd).

Is there a reason you need to run clvmd using an RA?

--Marc

Original comment by msmith...@gmail.com on 13 Apr 2014 at 11:57
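The check described above (looking for the agent file under /usr/lib/ocf/resource.d) can be sketched in shell. This is a minimal sketch, assuming the OCF root is /usr/lib/ocf as in the path quoted in this thread; `ra_exists` is a hypothetical helper name, not something shipped with ESOS or Pacemaker.

```shell
# Hypothetical helper: succeed if an OCF resource agent file exists and is
# executable. OCF_ROOT defaults to /usr/lib/ocf, the location quoted above.
ra_exists() {
    provider="$1"
    agent="$2"
    [ -x "${OCF_ROOT:-/usr/lib/ocf}/resource.d/${provider}/${agent}" ]
}
```

For example, `ra_exists lvm2 clvmd || echo "no such resource agent"` would report the situation seen in the first comment. The crm shell's `ra list` command should give a similar answer from inside crm.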

GoogleCodeExporter commented 9 years ago
OK, I just resolved it!

But now I have a different problem: when I format the datastore from ESX, I get 
this error:

 NMI: PCI system error (SERR) for reason b1 on CPU 0 
   Dazed and confused, but trying to continue

on both nodes.

Can you help me?

Original comment by dotcom...@gmail.com on 14 Apr 2014 at 2:12

GoogleCodeExporter commented 9 years ago
Can you produce a support package (TUI->Interface) and attach it?

Original comment by msmith...@gmail.com on 14 Apr 2014 at 2:46

GoogleCodeExporter commented 9 years ago
I changed to vdisk_fileio, and it is working.

But I think Pacemaker is not working!

Have you tested the resource agent?
If so, can you tell me how to launch it from the shell?

I tried:
OCF_RESKEY_CRM_meta_notify_start_uname="" \
OCF_RESKEY_CRM_meta_clone_max=2 \
ocf-tester -n p_drbd_r0 -o drbd_resource="r0" \
/usr/lib/ocf/resource.d/linbit/drbd

and I get an error:
/usr/lib/ocf/resource.d/linbit/drbd: line 288: syntax error: bad substitution
* rc=2: Your agent has too restrictive permissions: should be 755

Also, if I power off one node, the service doesn't start.
And after I power node 1 back on, DRBD stays Secondary instead of becoming Primary.

Original comment by dotcom...@gmail.com on 15 Apr 2014 at 1:18
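Two things in that ocf-tester output can be checked directly: the permission bits of the agent file, and which interpreter the script asks for. A "bad substitution" error is often what a shell prints when it meets an expansion it does not support, so the interpreter line is worth a look. This is a rough sketch, not a diagnosis. The helper names are hypothetical, and `stat -c` assumes GNU or BusyBox stat.

```shell
# Hypothetical helper: print the octal permission bits of a file
# (assumes GNU or BusyBox stat, which accept -c).
agent_mode() {
    stat -c '%a' "$1"
}

# Hypothetical helper: print the first line of a script, which normally
# names the interpreter the script expects.
agent_interpreter() {
    head -n 1 "$1"
}
```

Running `agent_mode /usr/lib/ocf/resource.d/linbit/drbd` should print 755 if the permissions match what ocf-tester asks for, and `agent_interpreter` on the same path shows the shell the agent is written for.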

GoogleCodeExporter commented 9 years ago
I'm not sure I understand all of your comment... are you having trouble with 
the DRBD RA?

Original comment by msmith...@gmail.com on 15 Apr 2014 at 1:27

GoogleCodeExporter commented 9 years ago
In your article there are some simple errors:

crm
cib new constraints
colocation c_r0_r1 inf: ms_scst:Started clone_lvm:Started ms_drbd:Master
order o_r0_r1 inf: ms_drbd:promote clone_lvm:start ms_scst:start
cib commit constraints
quit

should be:
crm
cib new constraints
configure colocation c_r0_r1 inf: ms_scst:Started clone_lvm:Started ms_drbd:Master
configure order o_r0_r1 inf: ms_drbd:promote clone_lvm:start ms_scst:start
cib commit constraints
quit

# 20130410 MAS

resource r0 {
        net {
                allow-two-primaries;
        }
        on cantaloupe.mcc.edu {
                device     /dev/drbd0;
                disk       /dev/disk-by-id/LUN_NAA-600605b0054a753018f855fa236d6d41;
                address    192.168.50.21:7788;
                meta-disk  internal;
        }
        on raisin.mcc.edu {
                device    /dev/drbd0;
                disk      /dev/disk-by-id/LUN_NAA-600605b0054a751018f856b51625577a;
                address   192.168.50.22:7788;
                meta-disk internal;
        }
}

In the official guide, the directive is

allow-two-primaries yes;

Original comment by dotcom...@gmail.com on 15 Apr 2014 at 1:29
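Which spelling of the directive is expected depends on the DRBD release: the `yes`/`no` form for boolean options belongs to the 8.4 configuration syntax, while the bare `allow-two-primaries;` is the older style. A small sketch for reading the running version, assuming the first line of /proc/drbd has the usual `version: X.Y.Z (...)` layout; `drbd_version` is a hypothetical helper name.

```shell
# Hypothetical helper: read /proc/drbd-style text on stdin and print the
# version number from the first line ("version: X.Y.Z (...)").
drbd_version() {
    sed -n '1s/^version: \([0-9.]*\).*/\1/p'
}
```

Usage would be `drbd_version < /proc/drbd` on a node with the DRBD module loaded.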

GoogleCodeExporter commented 9 years ago
That article was written a year ago, and it was written using the version of 
ESOS described in the article. I'm sure you're not using the same version.

Original comment by msmith...@gmail.com on 15 Apr 2014 at 1:32

GoogleCodeExporter commented 9 years ago
Yes, I'm using release r628; I know they are different versions.

Thanks for the guide. I only wanted to be of help.

To summarize, I built an ESOS cluster following your guide, 
with the difference that I do not use InfiniBand. 
I tried using SCST with vdisk_block, but when I try to create a datastore I get 
the error:
NMI: PCI system error (SERR) for reason b1 on CPU 0 
Dazed and confused, but trying to continue 

Then I tried vdisk_fileio (maybe I'm wrong) and everything works. 

1) But when I take a node down, VMware no longer sees the datastore. 
2) When I restart the node that was just powered off, crm_mon shows errors such as: 
  p_lvm_r0_stop error on node ....... 
3) DRBD on the node that was just powered on remains Secondary until I change it by hand.

Original comment by dotcom...@gmail.com on 15 Apr 2014 at 6:26

GoogleCodeExporter commented 9 years ago
Last updated: Tue Apr 15 18:45:11 2014
Last change: Tue Apr 15 18:31:55 2014 via cibadmin on esosc2.dytech.local
Stack: corosync
Current DC: esosc1.dytech.local (1) - partition with quorum
Version: 1.1.10-9d39a6b
2 Nodes configured
10 Resources configured

Online: [ esosc1.dytech.local esosc2.dytech.local ]

 Master/Slave Set: ms_drbd [g_drbd]
     Masters: [ esosc2.dytech.local ]
     Slaves: [ esosc1.dytech.local ]
 Clone Set: clone_notify [p_notify]
     Started: [ esosc1.dytech.local esosc2.dytech.local ]
fence_esosc1    (stonith:fence_ipmilan):        Started esosc1.dytech.local
fence_esosc2    (stonith:fence_ipmilan):        Started esosc2.dytech.local

Failed actions:
    p_scst_start_0 on esosc2.dytech.local 'unknown error' (1): call=79, status=complete, last-rc-change='Tue Apr 15 18:32:31 2014', queued=0ms, exec=226ms
    p_drbd_r0_promote_0 on esosc2.dytech.local 'unknown error' (1): call=58, status=complete, last-rc-change='Tue Apr 15 18:32:04 2014', queued=0ms, exec=3337ms
    p_drbd_r0_monitor_20000 on esosc1.dytech.local 'not running' (7): call=76, status=complete, last-rc-change='Tue Apr 15 18:43:39 2014', queued=0ms, exec=0ms

Original comment by dotcom...@gmail.com on 15 Apr 2014 at 6:45
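The "Failed actions" section of that status output can be pulled out mechanically, which helps when comparing the two nodes. This is a minimal sketch, assuming the layout shown above (one `<operation> on <node> ...` entry per line, with the section ended by a blank line or end of input); `failed_actions` is a hypothetical helper name.

```shell
# Hypothetical helper: read crm_mon text on stdin and print
# "<operation> <node>" for each entry under "Failed actions:".
failed_actions() {
    awk '
        /^Failed actions:/ { in_failed = 1; next }   # section starts here
        in_failed && NF == 0 { in_failed = 0 }       # a blank line ends it
        in_failed && $2 == "on" { print $1, $3 }     # operation and node
    '
}
```

With a one-shot status run this would be `crm_mon -1 | failed_actions`. Once the underlying cause is fixed, old failures can usually be cleared per resource with `crm resource cleanup`.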

GoogleCodeExporter commented 9 years ago
I changed my configuration to vdisk_block.

After I create a datastore:

Message from syslogd@esosc1 at Tue Apr 15 19:48:05 2014 ...
esosc1 kernel: [ 4713.084002] NMI: PCI system error (SERR) for reason a1 on CPU 0.

Message from syslogd@esosc1 at Tue Apr 15 19:48:05 2014 ...
esosc1 kernel: [ 4713.084002] Dazed and confused, but trying to continue

Original comment by dotcom...@gmail.com on 15 Apr 2014 at 7:50
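The reason code differs between the reports in this thread (b1 earlier, a1 here), so it may be useful to collect every code a node has logged. This is a small sketch that only matches the exact message text quoted above; `nmi_reasons` is a hypothetical helper name.

```shell
# Hypothetical helper: read kernel log text on stdin and print the reason
# code from each "NMI: PCI system error (SERR)" line, one per line.
nmi_reasons() {
    sed -n 's/.*NMI: PCI system error (SERR) for reason \([0-9a-f][0-9a-f]*\) on CPU.*/\1/p'
}
```

For example, `dmesg | nmi_reasons | sort | uniq -c` would count how often each code appears on a node.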

GoogleCodeExporter commented 9 years ago
In my dmesg I found:

[    0.970396] pci 0000:09:00.0: disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'
[    0.971145] pci 0000:00:02.0: PCI bridge to [bus 09-12]
[    0.972004] pci 0000:00:02.0:   bridge window [io  0x5000-0x5fff]
[    0.972007] pci 0000:00:02.0:   bridge window [mem 0xfdc00000-0xfddfffff]
[    0.973350] pci 0000:0a:00.0: [8086:3510] type 01 class 0x060400
[    0.975595] pci 0000:0a:00.0: PME# supported from D0 D3hot D3cold
[    0.976006] pci 0000:0a:00.0: disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'
[    0.977187] pci 0000:0a:01.0: [8086:3514] type 01 class 0x060400
[    0.979091] pci 0000:0a:01.0: PME# supported from D0 D3hot D3cold
[    0.979440] pci 0000:0a:00.0: disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'
[    0.980187] pci 0000:0a:02.0: [8086:3518] type 01 class 0x060400
[    0.982595] pci 0000:0a:02.0: PME# supported from D0 D3hot D3cold
[    0.983021] pci 0000:0a:00.0: disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'
[    0.984645] pci 0000:09:00.0: PCI bridge to [bus 0a-0f]
[    0.985003] pci 0000:09:00.0:   bridge window [io  0x5000-0x5fff]
[    0.985049] pci 0000:09:00.0:   bridge window [mem 0xfdd00000-0xfddfffff]
[    0.986472] pci 0000:0b:00.0: [1077:2432] type 00 class 0x0c0400
[    0.987411] pci 0000:0b:00.0: reg 0x10: [io  0x5000-0x50ff]
[    0.987914] pci 0000:0b:00.0: reg 0x14: [mem 0xfddf0000-0xfddf3fff 64bit]
[    0.989183] pci 0000:0b:00.0: reg 0x30: [mem 0x00000000-0x0003ffff pref]
[    0.992229] pci 0000:0b:00.1: [1077:2432] type 00 class 0x0c0400
[    0.992685] pci 0000:0b:00.1: reg 0x10: [io  0x5400-0x54ff]
[    0.993183] pci 0000:0b:00.1: reg 0x14: [mem 0xfdde0000-0xfdde3fff 64bit]
[    0.994457] pci 0000:0b:00.1: reg 0x30: [mem 0x00000000-0x0003ffff pref]
[    0.997292] pci 0000:0b:00.0: disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'
[    0.998236] pci 0000:0a:00.0: PCI bridge to [bus 0b-0d]
[    0.999004] pci 0000:0a:00.0:   bridge window [io  0x5000-0x5fff]
[    0.999050] pci 0000:0a:00.0:   bridge window [mem 0xfdd00000-0xfddfffff]
[    1.000468] pci 0000:0a:01.0: PCI bridge to [bus 0e]
[    1.002514] pci 0000:0a:02.0: PCI bridge to [bus 0f]
[    1.005823] pci 0000:09:00.3: PCI bridge to [bus 10-12]
[    1.006399] pci 0000:06:00.0: [103c:3230] type 00 class 0x010400
[    1.006399] pci 0000:06:00.0: reg 0x10: [mem 0xfdb00000-0xfdbfffff 64bit]
[    1.006399] pci 0000:06:00.0: reg 0x18: [io  0x4000-0x40ff]
[    1.006399] pci 0000:06:00.0: reg 0x1c: [mem 0xfdaf0000-0xfdaf0fff 64bit]
[    1.007018] pci 0000:06:00.0: reg 0x30: [mem 0x00000000-0x0003ffff pref]
[    1.007063] pci 0000:06:00.0: supports D1
[    1.007107] pci 0000:06:00.0: disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'
[    1.008007] pci 0000:00:03.0: PCI bridge to [bus 06-08]
[    1.009003] pci 0000:00:03.0:   bridge window [io  0x4000-0x4fff]
[    1.009006] pci 0000:00:03.0:   bridge window [mem 0xfda00000-0xfdbfffff]
[    1.009046] pci 0000:13:00.0: [103c:3230] type 00 class 0x010400
[    1.009046] pci 0000:13:00.0: reg 0x10: [mem 0xfdf00000-0xfdffffff 64bit]
[    1.009046] pci 0000:13:00.0: reg 0x18: [io  0x6000-0x60ff]
[    1.009056] pci 0000:13:00.0: reg 0x1c: [mem 0xfdef0000-0xfdef0fff 64bit]
[    1.009074] pci 0000:13:00.0: reg 0x30: [mem 0x00000000-0x0003ffff pref]
[    1.010043] pci 0000:13:00.0: supports D1
[    1.010099] pci 0000:13:00.0: disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'
[    1.011007] pci 0000:00:04.0: PCI bridge to [bus 13-15]
[    1.012002] pci 0000:00:04.0:   bridge window [io  0x6000-0x6fff]
[    1.012006] pci 0000:00:04.0:   bridge window [mem 0xfde00000-0xfdffffff]
[    1.012044] pci 0000:00:05.0: PCI bridge to [bus 16]
[    1.013039] pci 0000:02:00.0: [1166:0103] type 01 class 0x060400
[    1.013081] pci 0000:02:00.0: PME# supported from D0 D3hot D3cold
[    1.013154] pci 0000:02:00.0: disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'
[    1.014007] pci 0000:00:06.0: PCI bridge to [bus 02-03]
[    1.015004] pci 0000:00:06.0:   bridge window [mem 0xf8000000-0xf9ffffff]
[    1.015040] pci 0000:03:00.0: [14e4:164c] type 00 class 0x020000
[    1.015040] pci 0000:03:00.0: reg 0x10: [mem 0xf8000000-0xf9ffffff 64bit]

[   12.316157] serial8250: ttyS1 at I/O 0x2f8 (irq = 3, base_baud = 115200) is a 16550A
[   12.409670] Non-volatile memory driver v1.3
[   12.459775] Linux agpgart interface v0.103
[   12.509047] Hangcheck: starting hangcheck timer 0.9.1 (tick is 180 seconds, margin is 60 seconds).
[   12.616488] Hangcheck: Using getrawmonotonic().
[   12.670816] [drm] Initialized drm 1.1.0 20060810
[   12.760324] brd: module loaded
[   12.761380] loop: module loaded
[   12.761380] HP CISS Driver (v 3.6.26)
[   12.764060] cciss 0000:06:00.0: can't disable ASPM; OS doesn't have ASPM control
[   12.764063] pcieport 0000:00:03.0: driver skip pci_set_master, fix it!
[   12.764151] cciss 0000:06:00.0: irq 64 for MSI/MSI-X
[   12.764156] cciss 0000:06:00.0: irq 65 for MSI/MSI-X
[   12.764160] cciss 0000:06:00.0: irq 66 for MSI/MSI-X
[   12.764165] cciss 0000:06:00.0: irq 67 for MSI/MSI-X
[   12.864330] cciss 0000:06:00.0: cciss0: <0x3230> at PCI 0000:06:00.0 IRQ 64 using DAC
[   12.867397] cciss 0000:13:00.0: can't disable ASPM; OS doesn't have ASPM control
[   12.867401] pcieport 0000:00:04.0: driver skip pci_set_master, fix it!
[   12.867493] cciss 0000:13:00.0: irq 68 for MSI/MSI-X
[   12.867500] cciss 0000:13:00.0: irq 69 for MSI/MSI-X
[   12.867507] cciss 0000:13:00.0: irq 70 for MSI/MSI-X
[   12.867513] cciss 0000:13:00.0: irq 71 for MSI/MSI-X
[   12.947070] cciss 0000:13:00.0: cciss1: <0x3230> at PCI 0000:13:00.0 IRQ 68 using DAC
[   12.726323] Floppy drive(s): fd0 is 1.44M
[   13.475040] GPT:Primary header thinks Alt. header is not at the end of the disk.
[   13.563781] GPT:3726773919 != 3726887727
[   13.610763] GPT:Alternate GPT header not at the end of the disk.
[   13.682675] GPT:3726773919 != 3726887727
[   13.729655] GPT: Use GNU Parted to correct GPT errors.
[   13.791178]  cciss/c1d0: p1

Original comment by dotcom...@gmail.com on 15 Apr 2014 at 8:43

GoogleCodeExporter commented 9 years ago
I use an HP DL360 G5 and a P800 SAS controller (firmware updated) with an MSA70.

I found a newer version of the kernel driver from HP (in RPM format),
but you already have this driver inside the kernel.

How can I add this driver? (I don't know if this will resolve my problem!)

Original comment by dotcom...@gmail.com on 15 Apr 2014 at 8:45

GoogleCodeExporter commented 9 years ago
http://cciss.sourceforge.net/#docs

Original comment by dotcom...@gmail.com on 15 Apr 2014 at 9:04

GoogleCodeExporter commented 9 years ago

Original comment by msmith...@gmail.com on 27 Apr 2014 at 3:28