th3penguinwhisperer opened 7 years ago
I seem to get this error when I use the URL (http://xyz) without a trailing slash in the StorageClass. I captured it with tcpdump. I'm not sure if this is actually my problem or just a side effect of all my fiddling.
```
HTTP/1.1 500 Internal Server Error
Content-Type: text/plain; charset=utf-8
X-Content-Type-Options: nosniff
Date: Mon, 24 Jul 2017 19:35:53 GMT
Content-Length: 68
Set-Cookie: ff1a8205a9a2f52921c45408892389cf=d6838f12e85f29239853163010fa57a7; path=/; HttpOnly

Error calling v.allocBricksInCluster: database is in read-only mode
```

```
Error: Error calling v.allocBricksInCluster: database is in read-only mode
```
So I can reproduce both the error and the empty error message:
```
[root@openshift-master ~]# heketi-cli --json=true --user admin --secret mypassword -s http://heketi-default.cloud.xyz volume create --size=1 --persistent-volume-file=pv001.json
Error: Error calling v.allocBricksInCluster: database is in read-only mode
[root@openshift-master ~]# heketi-cli --json=true --user admin --secret mypassword -s http://heketi-default.cloud.xyz/ volume create --size=1 --persistent-volume-file=pv001.json
Error:
```
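One way to avoid the two forms of the URL behaving differently is to normalize the server address before passing it to `heketi-cli`. A minimal sketch using shell parameter expansion (the `HEKETI_SERVER` variable name is just an illustration, not part of heketi):

```shell
# Strip a single trailing slash, if present, from the heketi server URL
# before passing it to heketi-cli via -s, so both spellings behave the same.
HEKETI_SERVER="http://heketi-default.cloud.xyz/"

# ${var%/} removes one trailing "/" if the value ends with it.
NORMALIZED="${HEKETI_SERVER%/}"

echo "$NORMALIZED"
# heketi-cli --user admin --secret mypassword -s "$NORMALIZED" volume create --size=1
```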
@th3penguinwhisperer Could you please reformat your comments to have ``` around your snippets of output? :)
I think I've gotten a step further:
Using the URL with the trailing slash seems to hide the actual error; after removing it I can see that the DB is in read-only mode. I'm not sure why, or how to fix it permanently, but what I did was kill the heketi process (NOT the pod, as that didn't help). A new process is then spawned and it seems to work (or perhaps I'm lucky and the request now lands on a node where it works, while it still fails on the others).
Note that at heketi pod startup I see this message in the log:

```
[heketi] WARNING 2017/07/24 19:48:00 Unable to open database. Retrying using read only mode
```

So that's probably what puts it into read-only mode.
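If you want to spot this condition automatically, you can grep the captured heketi logs for the warning. A small self-contained sketch (here `heketi.log` is a stand-in for the output of `oc logs <heketi-pod>`; the sample line is the one quoted above):

```shell
# Simulate captured heketi logs; in practice this would come from
# `oc logs <heketi-pod> > heketi.log`.
cat > heketi.log <<'EOF'
[heketi] WARNING 2017/07/24 19:48:00 Unable to open database. Retrying using read only mode
EOF

# Detect whether heketi fell back to read-only mode at startup.
if grep -q "Retrying using read only mode" heketi.log; then
    DB_STATE="read-only"
else
    DB_STATE="ok"
fi
echo "$DB_STATE"
```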
Now that I know what causes this and how to work around it, perhaps the DeploymentConfig that gk-deploy creates could be changed so that there's a small delay before a new pod is created, or something similar?
So what seems to be the workaround? Where should this delay go?
I'm not sure this delay is even possible. However, when the existing pod is deleted, the new one should wait briefly before being recreated.
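The "wait briefly" idea can be sketched as a generic retry-with-timeout helper. In practice the condition would be something like checking that the old heketi pod is really gone (e.g. `oc get pod <old-pod>` failing); here a placeholder file-based check keeps the sketch self-contained, and `wait_for` is a hypothetical helper name, not anything from gk-deploy:

```shell
# Retry a condition up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 as soon as the condition succeeds, 1 if it never does.
wait_for() {
    local retries=$1; shift
    local delay=$1; shift
    local i
    for i in $(seq 1 "$retries"); do
        if "$@"; then
            return 0
        fi
        sleep "$delay"
    done
    return 1
}

# Placeholder condition: a marker file standing in for "old pod is gone".
touch old-pod-gone
wait_for 5 1 test -f old-pod-gone && echo "safe to start new pod"
```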
But I'll have to verify again whether, after the DB got out of read-only mode, I can retrigger the issue by deleting the heketi pod. I'll post an update.
Have you gotten anywhere with this, @th3penguinwhisperer? Cheers.
Hi,
Thanks for writing this nice tool to deploy Gluster on OpenShift. However, I still seem to be stuck with the above error in the logs. The gluster pods are running, and so is the heketi pod. The endpoints etc. are all available.
However, every claim I try goes to Pending state and stays there.
The gluster cluster is put in the default namespace.
When I go onto a gluster pod and run `gluster volume info` I see one volume, heketidbstorage. I believe it's also using this volume:
Might this be due to version differences? This is OpenShift 1.5.1 and the gluster pods seem to be gluster/gluster-centos:latest (nothing I did manually here, AFAIK).
Storageclass is:
Does someone have a clue what's going on here? There doesn't seem to be any logging from heketi :s
Thanks in advance.