cesium-ml / baselayer

Fully customizable (scientific, compute-intensive) web application template
http://cesium-ml.org/baselayer/
30 stars 18 forks source link

Make sure cron.py gets a valid jobs list #332

Closed profjsb closed 1 year ago

stefanv commented 1 year ago

This patch takes it from new style to old style.

profjsb commented 1 year ago

What? I don't see cfg.get anywhere...

More important, SP tests are failing because jobs is None (not a list). We need to fix this ASAP. @stefanv

guynir42 commented 1 year ago

@profjsb I am not sure why get fails here. It is supposed to do exactly what you want. Are you sure this is the only place in the code where this key is loaded?

profjsb commented 1 year ago

Look at any cron.log of recent test failures on SP--you'll why we need this fix. (I uncovered it locally when the app failed to fire for me). We haven't had a full test suite pass on main for awhile because of this bug.

image
profjsb commented 1 year ago

cfg.get only shows up 1 other place in baselayer:

$ rg "cfg.get"
services/external_logging/external_logging.py
59:    service_configs = cfg.get("external_logging", [])
profjsb commented 1 year ago

@profjsb I am not sure why get fails here. It is supposed to do exactly what you want. Are you sure this is the only place in the code where this key is loaded?

I dont know why it's failing but it is

guynir42 commented 1 year ago

@profjsb check that you are on the latest version of both skyportal and baselayer...

profjsb commented 1 year ago

Those errors are from the logs on main from the GitHub test archives

profjsb commented 1 year ago

See e.g. https://github.com/skyportal/skyportal/actions/runs/2667271046

guynir42 commented 1 year ago

I think I know what is going on. We have cron setup in baselayer but then cancel it in higher config files by leaving that key empty. That sends a None instead of missing key (which get would have successfully turned into []). The solution is not to get rid of get but to test for None explicitly.

profjsb commented 1 year ago

I think I know what is going on. We have cron setup in baselayer but then cancel it in higher config files by leaving that key empty. That sends a None instead of missing key (which get would have successfully turned into []). The solution is not to get rid of get but to test for None explicitly.

this is what my fix does -- it makes sure that jobs is a list which the remainder of the code requires.

Note this problem has been happening for at least weeks (I just checked some logs on github for random commits). It causes the "model logic" test to fail a lot.

profjsb commented 1 year ago

Ok, @guynir42, used your fix. Merge and pin?

guynir42 commented 1 year ago

@profjsb yeah it looks good. Go ahead and merge+pin.

stefanv commented 1 year ago

This is fine; the easy fix would have been to fix the config file that set the cron value to None instead of to [] (as it should)`.

guynir42 commented 1 year ago

This is fine; the easy fix would have been to fix the config file that set the cron value to None instead of to [] (as it should)`.

I told you the user will always find a way to screw up the config file in ways you didn't anticipate. Even when the user is us.