openfrontier / gerrit-docker

Operational scripts for docker-gerrit project.
Apache License 2.0

Internal users added by gerrit-create-user.sh won't be effective until the restart #4

Open thinkernel opened 7 years ago

thinkernel commented 7 years ago

The logs show that the user jenkins was added successfully, but the existence check that follows failed.

2017/8/30 17:13:44 [2017-08-30 09:13:44,187] [main] INFO com.google.gerrit.pgm.Daemon : Gerrit Code Review 2.14.3 ready
2017/8/30 17:13:46 Creating user: admin
2017/8/30 17:13:46 LDAP user admin was found in database
2017/8/30 17:13:46 Testing Gerrit Connection
2017/8/30 17:13:46 Creating user: jenkins
2017/8/30 17:13:46 Target group was not specified, defaulting to non-interactive
2017/8/30 17:13:47 User jenkins was created
2017/8/30 17:13:47 Testing Jenkins Connection & Key Presence
2017/8/30 17:13:47 Retrieving value: id_rsa.pub
2017/8/30 17:13:47 Checking if "jenkins" exists
2017/8/30 17:13:47 User does not exist: jenkins

Logging in to Gerrit as admin and checking the users in the Non-Interactive group shows an Anonymous Coward instead of the jenkins user that has already been added to the group.

Executing a command like the one below inside the container prints 404 as the HTTP status code:

curl --output /dev/null --silent --write-out "%{http_code}" "http://localhost:8080/gerrit/accounts/jenkins"
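To tell a short indexing delay apart from an account that never becomes visible, the same check could be wrapped in a retry loop. This is only a minimal sketch: `account_status`, `wait_for_account`, and the `GERRIT_URL` variable are hypothetical names and not part of gerrit-create-user.sh. The URL default is the one from the curl command above.

```shell
#!/bin/sh
# Hypothetical helper: print the HTTP status code of an account lookup.
# GERRIT_URL is an assumed variable; its default matches the curl command above.
account_status() {
  curl --output /dev/null --silent --write-out "%{http_code}" \
    "${GERRIT_URL:-http://localhost:8080/gerrit}/accounts/$1"
}

# Poll until the account lookup returns 200 or the attempts run out.
# Usage: wait_for_account NAME [ATTEMPTS] [DELAY_SECONDS]
wait_for_account() {
  name=$1
  attempts=${2:-10}
  delay=${3:-1}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if [ "$(account_status "$name")" = "200" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

Calling `wait_for_account jenkins 30 2` would retry for about a minute. If it still fails, the account is probably not going to appear without a restart or a reindex, which is what the logs above suggest.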

Trying to reindex the accounts with the command below throws an exception that points to a write lock.

su-exec ${GERRIT_USER} java ${JAVA_OPTIONS} ${JAVA_MEM_OPTIONS} -jar "${GERRIT_WAR}" reindex --verbose --index accounts -d "${GERRIT_SITE}"

[2017-08-30 09:40:23,978] [main] INFO com.google.gerrit.server.git.LocalDiskRepositoryManager : Defaulting core.streamFileThreshold to 240m
[2017-08-30 09:40:25,858] [main] INFO com.google.gerrit.server.cache.h2.H2CacheFactory : Enabling disk cache /var/gerrit/review_site/cache
Exception in thread "main" com.google.inject.ProvisionException: Unable to provision, see the following errors:

1) Error injecting constructor, org.apache.lucene.store.LockObtainFailedException: Lock held by another program: /var/gerrit/review_site/index/accounts_0004/write.lock
  at com.google.gerrit.lucene.LuceneAccountIndex.<init>(LuceneAccountIndex.java:95)
  while locating com.google.gerrit.server.index.account.AccountIndex annotated with @com.google.inject.internal.UniqueAnnotations$Internal(value=3)

1 error
  at com.google.inject.internal.InjectorImpl$2.get(InjectorImpl.java:1028)
  at com.google.inject.assistedinject.FactoryProvider2.invoke(FactoryProvider2.java:776)
  at com.sun.proxy.$Proxy24.create(Unknown Source)
  at com.google.gerrit.server.index.SingleVersionModule$SingleVersionListener.start(SingleVersionModule.java:90)
  at com.google.gerrit.server.index.SingleVersionModule$SingleVersionListener.start(SingleVersionModule.java:71)
  at com.google.gerrit.lifecycle.LifecycleManager.start(LifecycleManager.java:92)
  at com.google.gerrit.pgm.Reindex.run(Reindex.java:95)
  at com.google.gerrit.pgm.util.AbstractProgram.main(AbstractProgram.java:61)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at com.google.gerrit.launcher.GerritLauncher.invokeProgram(GerritLauncher.java:204)
  at com.google.gerrit.launcher.GerritLauncher.mainImpl(GerritLauncher.java:108)
  at com.google.gerrit.launcher.GerritLauncher.main(GerritLauncher.java:63)
  at Main.main(Main.java:24)
Caused by: org.apache.lucene.store.LockObtainFailedException: Lock held by another program: /var/gerrit/review_site/index/accounts_0004/write.lock
  at org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:118)
  at org.apache.lucene.store.FSLockFactory.obtainLock(FSLockFactory.java:41)
  at org.apache.lucene.store.BaseDirectory.obtainLock(BaseDirectory.java:45)
  at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:776)
  at com.google.gerrit.lucene.AutoCommitWriter.<init>(AutoCommitWriter.java:35)
  at com.google.gerrit.lucene.AutoCommitWriter.<init>(AutoCommitWriter.java:31)
  at com.google.gerrit.lucene.AbstractLuceneIndex.<init>(AbstractLuceneIndex.java:111)
  at com.google.gerrit.lucene.LuceneAccountIndex.<init>(LuceneAccountIndex.java:95)
  at com.google.gerrit.lucene.LuceneAccountIndex$$FastClassByGuice$$7fe9e296.newInstance()
  at com.google.inject.internal.DefaultConstructionProxyFactory$FastClassProxy.newInstance(DefaultConstructionProxyFactory.java:89)
  at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:111)
  at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:90)
  at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:268)
  at com.google.inject.internal.InjectorImpl$2$1.call(InjectorImpl.java:1019)
  at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1085)
  at com.google.inject.internal.InjectorImpl$2.get(InjectorImpl.java:1015)
  ... 15 more
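The "Lock held by another program" message is presumably caused by the Gerrit daemon in the same container, which still has the accounts index open while the offline reindex program tries to open it as well. A minimal sketch of a guard that refuses to start the reindex while the daemon is up follows. `gerrit_is_running` and `offline_reindex_accounts` are hypothetical names, and the `pgrep` pattern assumes the daemon was started as `java ... -jar ${GERRIT_WAR} daemon ...`, which may not match every setup.

```shell
#!/bin/sh
# Assumption: the daemon command line contains "<path to gerrit.war> daemon".
gerrit_is_running() {
  pgrep -f "${GERRIT_WAR} daemon" >/dev/null 2>&1
}

# Run the reindex command from above only when no daemon process is found.
offline_reindex_accounts() {
  if gerrit_is_running; then
    echo "Gerrit is still running; stop it before an offline reindex" >&2
    return 1
  fi
  # JAVA_OPTIONS and JAVA_MEM_OPTIONS are left unquoted on purpose so that
  # they split into separate arguments, as in the original command.
  su-exec "${GERRIT_USER}" java ${JAVA_OPTIONS} ${JAVA_MEM_OPTIONS} \
    -jar "${GERRIT_WAR}" reindex --verbose --index accounts -d "${GERRIT_SITE}"
}
```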

This issue only happens in 2.14.x; 2.13.x looks good.

avoidik commented 7 years ago
Anonymous Coward
Username that is displayed in the Gerrit WebUI and in e-mail notifications if the full name of the user is not set.

Did you set the full name?

avoidik commented 7 years ago

How do you run these scripts? Via nohup, or one after another? It sounds like a race condition.

thinkernel commented 7 years ago

Yes, it's run via nohup. I use gerrit-init.nohup to add the jenkins user, with the full name provided.

thinkernel commented 7 years ago

After the restart, the full name of the jenkins user is shown correctly.

avoidik commented 7 years ago

Could you try to do this without nohup, waiting for each script to finish before starting the next? Just run them one after another, to rule out any race conditions.

It might also be a good idea to name the scripts with leading numbers.

The problem could also be the point at which the scripts are executed. I've slightly modified my copy of entrypoint.sh:

...
  echo
  for f in /docker-post-init.d/*; do
    case "$f" in
      *.sh)    echo "$0: running $f"; source "$f" ;;
      *.nohup) echo "$0: running $f"; nohup  "$f" & ;;
      *)       echo "$0: ignoring $f" ;;
    esac
    echo
  done
fi
exec "$@"
\EOF

And the Dockerfile:

...
RUN mkdir /docker-entrypoint-init.d
RUN mkdir /docker-post-init.d
...

This way I can be sure that some of the scripts run before Gerrit starts up and the rest run after it.
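A minimal sketch of how leading numbers interact with a loop like the one in entrypoint.sh: the shell expands `*` in sorted order, so a script named `10-...` is handled before one named `20-...`. The function name `run_init_scripts` and the file names in the usage note are hypothetical, and this is a simplified stand-in rather than the actual entrypoint.

```shell
#!/bin/sh
# Handle every file in a directory in glob (sorted) order, so that numeric
# prefixes such as 10- and 20- decide the sequence.
run_init_scripts() {
  dir=$1
  for f in "$dir"/*; do
    [ -e "$f" ] || continue   # the glob matched nothing (empty directory)
    case "$f" in
      *.sh)
        echo "running $f"
        . "$f"                # sourced: the loop waits until it finishes
        ;;
      *.nohup)
        echo "running $f in the background"
        nohup "$f" >/dev/null 2>&1 &   # not waited for: may race with later scripts
        ;;
      *)
        echo "ignoring $f"
        ;;
    esac
  done
}
```

With files such as `10-create-admin.sh` and `20-create-jenkins.sh` in `/docker-post-init.d`, `run_init_scripts /docker-post-init.d` would finish the first before starting the second. Anything ending in `.nohup` is started in the background, so the loop moves on without waiting for it, which is where a race could come from.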

thinkernel commented 7 years ago

Adding a leading number is a good idea. I will give it a try tomorrow. All the user-adding functions go through the Gerrit REST API, so I really can't figure out how a race condition could happen. By the way, I need more details about /docker-post-init.d. How do you make the scripts in this directory run after gerrit-start.sh?