pushkin-consortium / pushkin

A customizable, scalable ecosystem for massive online psychological experiments
https://pushkin-consortium.github.io/pushkin/
MIT License

Why does `pushkin start` sometimes give that weird error, sometimes not #225

Closed by jkhartshorne 8 months ago

jkhartshorne commented 1 year ago

Problem started yesterday. Seems probabilistic.

jessestorbeck commented 1 year ago

For documentation purposes, the error looks like this when you try to run an experiment on localhost:

[Screenshot of the error, 2023-07-25]

Two strategies that may mitigate this are running pushkin stop; pushkin prep; pushkin start and simply waiting a few minutes.

jessestorbeck commented 1 year ago

Update on how this is behaving today: pushkin stop; pushkin start is not working, but pushkin stop; pushkin prep; pushkin start is.

Also, not sure if this is related, but when I was trying to make changes to an experiment while testing pushkin-consortium/pushkin-exptemplates-lexical#4, running pushkin stop; pushkin prep; pushkin start wasn't enough to show updates to the experiment. I had to do pushkin armageddon; pushkin prep; pushkin start.

ellissc commented 1 year ago

Running into the same issue. Running pushkin stop; pushkin kill followed by pushkin prep; pushkin start also seems to leave prep stuck in a loop printing "Waiting for test transaction db...".

pushkin armageddon; pushkin prep; pushkin start works for me as well, though whether it hits the runtime error is still probabilistic.

jessestorbeck commented 1 year ago

Some additional data points on this issue:

jkhartshorne commented 1 year ago

Suggested order of operations for figuring this out:

  1. Figure out what exactly isn't starting. Is it the worker? Or something else that the worker is waiting on?
  2. Check how docker rebuilds are handled. Something going on there?
  3. Check how npm rebuilds are handled. Is there an issue there?
  4. Search for other possible explanations.

jessestorbeck commented 9 months ago

It turns out the issue relates to the site's pushkin/front-end/src/.env.js file and the setEnv() function in pushkin-cli/commands/prep/index.js. As far as I can tell, all this function does is set the debug environment variable. It's called in a couple of places, most importantly during install experiment, where debug gets set to false, and during start, where it's set to true.

It's seemingly critical for something during prep that debug be true. I have not yet looked to see why exactly, but it probably has something to do with building the Docker images. The explanation for the pattern of errors is:

  1. You add an experiment and debug gets set to false.
  2. You run prep with debug as false, creating the problem.
  3. You run start, which sets debug to true, but the problem has already been introduced into Docker.
  4. No experiments work, and you run stop.
  5. You run prep with debug now as true, and the problem is resolved.
  6. You run start, and all experiments work.
  7. If you add another experiment, the cycle restarts.
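
The cycle above can be replayed with a toy model. This is only a sketch of the reported behavior, not pushkin-cli's code: the `simulate` function, the `builtWithDebug` variable, and the initial value of `debug` are all hypothetical, and the model assumes that prep bakes in whatever value debug has at that moment.

```javascript
// Toy model of the cycle described above. Everything here is hypothetical:
// it mimics the reported behavior, not pushkin-cli's actual implementation.
function simulate(commands) {
  let debug = true;          // value currently in .env.js (assumed starting value)
  let builtWithDebug = null; // value baked in by the most recent prep (assumption)
  const log = [];
  for (const cmd of commands) {
    switch (cmd) {
      case 'install experiment':
        debug = false; // reported: install experiment sets debug to false
        break;
      case 'prep':
        builtWithDebug = debug; // prep builds with whatever debug is right now
        break;
      case 'start':
        debug = true; // reported: start sets debug to true, but too late for this build
        log.push(builtWithDebug === true ? 'experiments work' : 'runtime error');
        break;
      case 'stop':
        break; // stop does not touch debug in this model
      default:
        throw new Error(`unknown command: ${cmd}`);
    }
  }
  return log;
}

const outcomes = simulate([
  'install experiment', 'prep', 'start', 'stop', // first attempt
  'prep', 'start',                               // second attempt
]);
console.log(outcomes); // [ 'runtime error', 'experiments work' ]
```

In this model, stop followed directly by start fails again because nothing rebuilds with the new value, which matches the earlier observation that stop; start did not help while stop; prep; start did.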

I'm not sure what the purpose of the debug env var is in the first place. Until we can decide on the real fix, I can get new experiments to run correctly the first time on localhost by manually editing pushkin/front-end/src/.env.js to set debug to true AFTER install experiment but BEFORE prep.
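
The manual edit could also be scripted. The sketch below is a guess at how, since the actual contents of .env.js are not shown in this thread: it assumes the file contains an assignment of the form `debug = false` (or `debug: false`), and both the `forceDebugTrue` helper and the sample file contents are hypothetical.

```javascript
// ASSUMPTION: .env.js contains an assignment such as `debug = false`.
// The real file's layout is not shown in this thread; adjust the pattern if it differs.
function forceDebugTrue(envSource) {
  // Flip only the first `debug = false` / `debug: false`, keeping the original spacing.
  return envSource.replace(/\bdebug(\s*[=:]\s*)false\b/, 'debug$1true');
}

// Hypothetical file contents, for illustration only.
const before = 'export const debug = false;\n';
const after = forceDebugTrue(before);
console.log(after.trim()); // export const debug = true;
```

To apply it to a real site, the file at pushkin/front-end/src/.env.js could be read with Node's `fs.readFileSync`, passed through the helper, and written back with `fs.writeFileSync`, after install experiment and before prep.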

hunterschep commented 9 months ago

This snippet of code from pushkin/front-end/src/config.js is particularly relevant to the issue:

if (debug) {
  // Debug / Test
  const rootDomain = 'http://localhost';
  apiEndpoint = rootDomain + '/api';
  frontEndURL = rootDomain + '/callback';
  logoutURL = rootDomain;
} else {
  // Production
  const rootDomain = pushkinConfig.info.rootDomain;
  if (pushkinConfig.apiEndpoint) {
    // What's in the YAML can override the default
    apiEndpoint = pushkinConfig.apiEndpoint;
  } else {
    apiEndpoint = 'https://api.' + rootDomain;
  }
  frontEndURL = 'https://' + rootDomain + '/callback';
  logoutURL = 'https://' + rootDomain;
}

When debug is true, the application points every URL at localhost, which is what a local development environment needs. When debug is false (production), it instead builds the API endpoint and URLs from pushkinConfig and its rootDomain. That production configuration may be where things go wrong, for example through incorrect settings or a failure to load pushkinConfig, producing runtime errors or connectivity issues.
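
To compare the two branches side by side, the quoted logic can be restated as a pure function. This is a sketch only: `resolveUrls` and the sample config with `example.org` are hypothetical, and only the branch logic comes from the snippet above.

```javascript
// Restatement of the quoted branch as a pure function so both outcomes can be compared.
// `resolveUrls` and `sampleConfig` are hypothetical names used for illustration.
function resolveUrls(debug, pushkinConfig) {
  if (debug) {
    // Debug / test: everything points at localhost
    const rootDomain = 'http://localhost';
    return {
      apiEndpoint: rootDomain + '/api',
      frontEndURL: rootDomain + '/callback',
      logoutURL: rootDomain,
    };
  }
  // Production: URLs are derived from the site's config
  const rootDomain = pushkinConfig.info.rootDomain;
  return {
    // A value in the config overrides the default
    apiEndpoint: pushkinConfig.apiEndpoint || 'https://api.' + rootDomain,
    frontEndURL: 'https://' + rootDomain + '/callback',
    logoutURL: 'https://' + rootDomain,
  };
}

const sampleConfig = { info: { rootDomain: 'example.org' } }; // made-up domain
console.log(resolveUrls(true, sampleConfig).apiEndpoint);  // http://localhost/api
console.log(resolveUrls(false, sampleConfig).apiEndpoint); // https://api.example.org
```

If a front end built with debug set to false is then served on localhost, its requests would target the production API host rather than the local one. That would be consistent with the error seen here, though the thread does not confirm that this is the exact mechanism.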

jessestorbeck commented 9 months ago

Fixed in 42e4617d4dd2c4dcf1e5702086e51cf03dbbffe8 and soon to be released in pushkin-cli v4.0.

In the next release, setEnv will only be called once during install site and set debug=true. You will be able to set debug=false by running pushkin prep --production.

Experiments now work for me the first time, and when I run pushkin prep --production, I reach the familiar error during local testing.
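
The behavior described for the next release can be modeled in a few lines. This is a hypothetical sketch, not the CLI's code: `debugAfter` is an invented helper, and it assumes that a plain prep leaves debug unchanged, which the comment above does not state explicitly.

```javascript
// Toy model of the behavior described for pushkin-cli v4.0; hypothetical, not the CLI's code.
// ASSUMPTION: a plain `prep` (without --production) leaves debug unchanged.
function debugAfter(commands) {
  let debug; // undefined until the site has been installed
  for (const cmd of commands) {
    if (cmd === 'install site') {
      debug = true; // described: setEnv is called once here with debug=true
    } else if (cmd === 'prep --production') {
      debug = false; // described: the only way to switch debug off
    }
    // install experiment, prep, start, and stop no longer touch debug
  }
  return debug;
}

console.log(debugAfter(['install site', 'install experiment', 'prep', 'start'])); // true
console.log(debugAfter(['install site', 'install experiment', 'prep --production', 'start'])); // false
```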