NYUCCL / psiTurk

An open platform for science on Amazon Mechanical Turk.
https://psiturk.org
MIT License
277 stars 140 forks source link

Feature: Adding support for stream logging (only used when ON_CLOUD=1) #482

Closed jmuchovej closed 3 years ago

jmuchovej commented 3 years ago

I was recently trying to debug different local application behavior vs Heroku-hosted behavior. (One of my custom routes was breaking on Heroku, but not locally.)

However, since ON_CLOUD will reset psiTurk's logging, this made it a nightmare to debug. https://github.com/NYUCCL/psiTurk/blob/d81d3532c4cf09970a70c901671d27ad3941df09/psiturk/experiment.py#L37-L41

My changes are detailed here. Essentially, I replaced the enforced logging to a file to logging.<type>Handlers so that a StreamHandler could be added, allowing for Heroku pull psiTurk logs, since they aggregate all stdout and stderr output.

deargle commented 3 years ago

Wait I'm confused, but I admit to being a noob with logging, and I know there's some head scratching stuff in psiturk rn with logging. At one point that logging was disabled so that the SECRET_KEY didn't get logged to the heroku logs, but on second thought there's no need to log that anyway. psiturk_server.py sets the logger to be stderr, and I thought there was code that bound the flask logger to gunicorn's when gunicorn is detected. How do your changes fit in with that? Do you see the part I'm talking about? Thanks for helping with this btw, again the current state is a mess.

jmuchovej commented 3 years ago

Hmm... Let me roll back the experiment to a state where I should have the breakdown. Give me a few minutes. 😅

I don't recall seeing anything from gunicorn nor flask, but will check back.

jmuchovej commented 3 years ago

Rollback

Okay, so, rolling back to a few commits before I implemented the change – I see only logs from the following:

Trying the erroneous path, I don't see Heroku's logs raising the error through their dashboard (https://dashboard.heroku.com/apps/<app>/logs).

psiturk_server.py?

I don't see a psiturk_server.py file anywhere, so I'll assume you meant experiment_server.py? If so, I don't see what you're mentioning. Presumably you're referring to this? https://github.com/NYUCCL/psiTurk/blob/e53943af5e3bce2e14b02b8a9c86b4eefc23be1e/psiturk/experiment_server.py#L84-L94 I'm not sure that actually sets anything? Doing a quick search over gunicorn and flask, I don't see any meaningful mention of a user_options (or options) variable.

Looking over gunicorn and flask, I see everything you mentioned; so now I'm just as puzzled. 😂 Maybe this has something to do with initialization order?

What I do know:

What I'd guess:

jmuchovej commented 3 years ago

Okay, I think I know what's going on. When I modified psiTurk to only setup logging when $ON_CLOUD is unset, whatever's normally logging (gunicorn, presumably?) dumps the expected errors to Heroku's logs. So, it seems that doing logging.basicConfig(filename=None) is what's causing the breakdown.

I also have a nagging feeling that it might be good to specify log files however gunicorn/flask expects, but I don't know how to do that. (Nor if it's a super high priority for you/psiTurk. 😅)

deargle commented 3 years ago

This is the gunicorn logging binding thing I was remembering: https://github.com/NYUCCL/psiTurk/blob/master/psiturk/experiment.py#L58

I'll help think over this too as early as next week. Thanks again for your help.

deargle commented 3 years ago

Are you using app.logger to write to logs? https://flask.palletsprojects.com/en/1.1.x/logging/

deargle commented 3 years ago

I'm thinking we maybe just need to set flask's logging level to be the same as gunicorn, when gunicorn is being used: https://trstringer.com/logging-flask-gunicorn-the-manageable-way/

The code I linked two comments up is an attempt to detect whether the app is being run directly as a flask application (not supported by psiturk, but I do this sometimes when I want live template reloading) or if it's being run via gunicorn.

jmuchovej commented 3 years ago

No, I wasn't explicitly writing to logs. I was trying see what the exceptions being thrown were.

deargle commented 3 years ago

I should point out that if you have not set any value for log file in your config.txt, then stout and stderr will be used by default if ON_CLOUD=1. This differs from previous psiturk versions which would ignore any log file setting in config.txt and just use stout/stderr if ON_CLOUD=1. Config files generated by new psiturk versions comment out all defaults in the config.txt file generated by the psiturk-setup-example command, so a migrated config file might lead to unexpected logging results. Is that your case?

jmuchovej commented 3 years ago

I believe this is my case, but to be sure – I'm using psiTurk 3.0.4 and have set ON_CLOUD in Heroku. Here's my config, if it's helpful:

config.txt ``` [HIT Configuration] title = description = keywords = Perception, Psychology lifetime = 24 us_only = true approve_requirement = 95 number_hits_approved = 100 require_master_workers = false browser_exclude_rule = MSIE, mobile, tablet allow_repeats = false ad_url_domain = [Database Parameters] table_name = [Server Parameters] loglevel = 2 debug = false threads = 1 [Task Parameters] experiment_code_version = num_conds = 5 allow_repeats = false num_counters = 1 contact_email_on_error = cutoff_time = 40 [Shell Parameters] launch_in_sandbox_mode = false persistent_history_file = .psiturk_history ```
deargle commented 3 years ago

I don't see the old logfile=server.log in there at all, did you remove it just now?

On Thu, Mar 25, 2021, 11:40 AM John Muchovej @.***> wrote:

I believe this is my case, but to be sure – I'm using psiTurk 3.0.4 and have set ON_CLOUD in Heroku. Here's my config, if it's helpful: config.txt

[HIT Configuration]

title =

description =

keywords = Perception, Psychology

lifetime = 24

us_only = true

approve_requirement = 95

number_hits_approved = 100

require_master_workers = false

browser_exclude_rule = MSIE, mobile, tablet

allow_repeats = false

ad_url_domain =

[Database Parameters]

table_name =

[Server Parameters]

loglevel = 2

debug = false

threads = 1

[Task Parameters]

experiment_code_version =

num_conds = 5

allow_repeats = false

num_counters = 1

contact_email_on_error =

cutoff_time = 40

[Shell Parameters]

launch_in_sandbox_mode = false

persistent_history_file = .psiturk_history

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/NYUCCL/psiTurk/issues/482#issuecomment-807160928, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAI6Y7I4KGZURFO2BNTEPFLTFNYPTANCNFSM4ZYH42CQ .

jmuchovej commented 3 years ago

I removed it before deploying with psiTurk 3 and using Heroku. (I can go back to a prior deployment that should be on psiTurk 2 and check for logged errors, if that would be helpful.)

deargle commented 3 years ago

No need, let me do some testing later to try to replicate.

On Thu, Mar 25, 2021, 11:58 AM John Muchovej @.***> wrote:

I removed it before deploying with psiTurk 3 and using Heroku. (I can go back to a prior deployment that should be on psiTurk 2 and check for logged errors, if that would be helpful.)

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/NYUCCL/psiTurk/issues/482#issuecomment-807190364, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAI6Y7LS3BPHNIMH63R7BZLTFN2SPANCNFSM4ZYH42CQ .

jmuchovej commented 3 years ago

Looks like this can be closed in #483 was merged?