ducoquelicot / observ_admin

All admin and back-end for Observ
0 stars 0 forks source link

connect APScheduler with the framework #15

Open zstumgoren opened 5 years ago

zstumgoren commented 5 years ago

Some running notes on integration of APScheduler with Flask. Below may not get you all the way to working state, but will add things to test as time allows.

Configuration

Per this example from Flask-APScheduler, try adding the below lines to this section of your app's __init__.py (right after you instantiate the scheduler):

scheduler.init_app(observ)
scheduler.start()

Create a test job

After you've done the configuration above, try adding a simple test job that logs to a file or the console. For example, in app/tasks.py:

# app/tasks.py
def test_job():
     from datetime import datetime
     with open('/tmp/flask_scheduler_test.txt', 'a') as log:
         log.write("{}".format(datetime.now()))

Run the test job

In one of the routes, try importing the scheduler that you configured in app/__init__.py and add the job so that it runs immediately (not on an interval). The search route is nice and simple so it may be a good place to test this behavior. The below setup is intended to log a timestamp to a file every time the search route is run. You may need to fix/tweak the code as I haven't tested it.

# routes.py
# at top of file, add imports
from app import scheduler
from app.tasks import test_job

def search():
    # add this line to your search function. 
    # omitting rest of code for brevity
    scheduler.add_job(test_job)

Verify the job ran

It's not clear to me if a record will persist in SQLAlchemy after a job is complete. It's possible the framework clears out the record for completed jobs. You'd need to research this to verify.

However, you can verify whether the job is working by looking at the '/tmp/flask_scheduler_test.txt file. You should see a new timestamp in the file every time the search route is executed.

ducoquelicot commented 5 years ago

Would this just work for the Flask-APScheduler or could I also rebuild into the more general APScheduler framework? Is there a preference? All the stuff that is in place now is based on the general APScheduler framework.

zstumgoren commented 5 years ago

@ducoquelicot Hey, i just dropped some initial ideas for testing the job scheduler behavior. These may or may not work, so let me know what happens. If you can, try making these changes on a new branch and push that branch. that way I can pull down branch and help debug if you hit any snags.

zstumgoren commented 5 years ago

woops. one correction. to the configuration section: app should be observ in your case

ducoquelicot commented 5 years ago

@zstumgoren Would this just work for the Flask-APScheduler or could I also rebuild into the more general APScheduler framework? Is there a preference? All the stuff that is in place now is based on the general APScheduler framework.

zstumgoren commented 5 years ago

I think you have most of the setup for Flask-APScheduler correct. But I think the way you're trying to add the job is not quite right. The key ingredients in the recommendations above are:

So if the approach outlined above works for this simple test_job, I think you can delete the FlaskScheduler imports/setup you're performing in routes, and then try using the same strategy for the subscription code (i.e. use the scheduler instance imported from app/__init__.py to add the jobs).

Hope that makes sense. I'll be out of pocket most rest of day, but ping back on this ticket if you hit roadblocks and I'll try to take a look tonight.

ducoquelicot commented 5 years ago

Also, I'm not sure about how to configure the route for the subscribe element in the right way. What I tried to do yesterday was refer the form action to the '/subscribe' route and then add an 'if form validate on submit', but that didn't seem to do anything because the form was of course already submitted. So now it's gone, but I don't know if that's the right way to do it - i.e. whether it's going through the motions now or not.

ducoquelicot commented 5 years ago

@zstumgoren ok, I'll see if I can make it work. I don't know how much easier or harder it's going to be to add the jobs to the right place, but we'll have to see.

zstumgoren commented 5 years ago

@ducoquelicot Yep, i think there may be some bits of trickiness still to work through. But let's try to get a simple, toy example working. Then we can dive back into the more complex, real use case in subscribe.

ducoquelicot commented 5 years ago

Because right now the way I'm adding the jobs is according to the general APScheduler framework, not according to the Flask-APScheduler framework. I think a problem with the Flask-APS might be that the jobs are listed in the Config file so I'll have to find a way to add them to the Config file. Don't know how to do that yet.

ducoquelicot commented 5 years ago

Ok! I'll see if I can get the test running.

zstumgoren commented 5 years ago

@ducoquelicot cool. yeah, i gave the configs a quick look and assumed that it was kosher to drop them where you currently have them. But let's keep that in mind as a possible variable to explore if the strategy laid out above doesn't work. If it does work, that would suggest the configs are fine and we can focus on getting the subscribe jobs to work using the same strategy.

zstumgoren commented 5 years ago

@ducoquelicot I worked through the scheduler integration on a new scheduler branch. The details of the changes can be viewed in this commit.

There's code in the routes.py and app/tasks.py related to a test_job function I created. That code should eventually be removed.

Anyhow, the scheduler now appears to be working. In other words, I scheduled the test_job to run immediately from the index page (when it loads); and then verified that the function was indeed writing timestamps to the expected file in /tmp every time i loaded the home page.

No database records were created afaict, but perhaps this only occurs when you schedule a job for the future. That would make sense.

Anyhow, please take a careful look and merge into master if everything makes sense. Then you should be able to pivot to fixing your own scheduled jobs.

zstumgoren commented 5 years ago

I just tested scheduled work on an interval and this seems to work as expected (i.e. database record is created and timestamps are written to file by test_job). Here's how I configured if you want to test as well:

# routes.py
def index():
    scheduler.add_job('test_job', test_job, trigger='interval', seconds=3)
ducoquelicot commented 5 years ago

Let me check if it works on my machine too, first. I'll let you know

ducoquelicot commented 5 years ago

Just a little thing: if I open your branch on my machine, observ.db is back in the version control. Update: and on master it's back in version control as well.

ducoquelicot commented 5 years ago

Two more things:

UPDATE: I had to delete the most recent version in the migrations folder. I'm not going to merge with your branch because pulling in your branch is, I think, what messed up my database. I'll create all the changes manually and work from there.

ducoquelicot commented 5 years ago

Got everything to work! I was a bit confused at first by the way the add_job was supposed to work but then I realized what it wanted so I'm all good. Will try to set up the actual subscriptions now and see if that works as well.

ducoquelicot commented 5 years ago

Update: I now figure out that jobs can't have the same ID. I notice the following from the docs:

If you schedule jobs in a persistent job store during your application’s initialization, you MUST define an explicit ID for the job and use replace_existing=True or you will get a new copy of the job every time your application restarts!

I'm going to try what happens if I don't put an ID in and we'll see what that does.

UPDATE: turns out it needs an id. I'll have to figure out a way to see if I can make db add an id number/name/whatever.

EDIT: another thing I'll have to make sure is that the 'tasks' table is already in the sqlite db before I add jobs to it because otherwise, every time I run flask db migrate it'll detect a table that shouldn't be there (tasks) and remove it.

I'm thinking of a way to combine the Model and the job.

zstumgoren commented 5 years ago

Excellent. Yeah, I'd construct a unique id in just the way you describe. Sounds like you're on the right path. Good stuff!

ducoquelicot commented 5 years ago

What I ended up doing was getting the newly created subscription id and use that as the task id. Makes a lot of sense, since those two are directly related.

Only issue is that now, I still create the tasks table outside Models.py which means Migrate won't recognize it. Don't know the right workaround to that since I can't create an empty table. Would it make sense to create a 'Tasks' model with the same columns that are instantiated when the table is created through APScheduler?

In other news: as you may have seen, I have been able to create the initial 'you have subscribed' email. Now working on the actual subscription emails. I'm progressing towards a way to check whether there are new results. To do that, I had to perform the search again on 'subscribe' and add the list of IDs to the subscription, as well as the total. In doing so, I can run another search in the task and if the total of that search > the stored total in the subscription, then things should happen.

What I'm working on now is a way to compare the two lists of IDs so that I can get the IDs that are new and run them against the IDs from the database so that I can return the SQLite objects, which I then can use to populate the email.

Step by step...

edit: after that, I will also have to update the value of the subscription results and total so that they can be used as the new reference. Which will probably quite easy to do but I'm just listing the steps.

ducoquelicot commented 5 years ago

@zstumgoren I've quit for the day, I'm running into an issue that I don't know how to solve atm and it's 8.30 so it's about time I quit.

Here's the issue, hope you can help.

I've set up everything pretty neatly, I dare say. When you click 'subscribe', you'll get an initial email that will acknowledge your subscription (I should tweak the text a little bit since I won't be sending an overview of the initial results, but that's minor details).

Then, when you go to tasks.py, you can see the task. It's supposed to compare the ids of the last search instance (first one being the moment you hit 'subscribe') with the current search instance, fetch the new IDs, run them through the database and get the corresponding objects that can populate the email. Then, it updates that particular subscription so that the new search results are stored.

Right now, I have set it up a bit differently than it should be set up when everything runs. For instance, I currently check whether total == sub.total while in the actual app, this should be total > sub.total because that indicates new results. Obviously, this is done for testing purposes.

Another thing that I did is prepopulate the 'when' list with some IDs, so that the email would actually get some results. Otherwise, there would be nothing in the list and I think (but didn't try this) that the code would break there.

Right now, the code breaks somewhere else: in the email department. The error message that I get is as follows:

File (xxx) in sub_job:
send_results_email(user, sub, output)
File (xxx) in send_results_email:
user=user, subscription=subscription, results=results)
File (xxx) in render_template:
ctx.app.update_template_context(context)
AttributeError: 'NoneType' object has no attribute 'app' 

You'll have to look in emails.py, this is where those functions and files are.

I've looked briefly online and from what I can see, it somehow means that you're trying to instantiate something from Flask while the app is not present there. Which is... weird... because all the other emails are working, and I can't see anything that I've done differently. But there must be something going on.

Hope you can check with me - pushed changes to master.

zstumgoren commented 5 years ago

@ducoquelicot Assuming the APScheduler work is complete and bugfree, did you merge into master and push? Any new issues related to other features such as email subscription, let's put in a new ticket.

ducoquelicot commented 5 years ago

Yes, I pushed to master! I'll copy the issue into another ticket.

Get Outlook for Androidhttps://aka.ms/ghei36


From: Serdar Tumgoren notifications@github.com Sent: Tuesday, May 14, 2019 7:38:50 AM To: ducoquelicot/observ_admin Cc: Fabienne Rosina Nicole Meijer; Mention Subject: Re: [ducoquelicot/observ_admin] connect APScheduler with the framework (#15)

@ducoquelicothttps://github.com/ducoquelicot Assuming the APScheduler work is complete and bugfree, did you merge into master and push? Any new issues related to other features such as email subscription, let's put in a new ticket.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/ducoquelicot/observ_admin/issues/15?email_source=notifications&email_token=ALIJ6TCOQEHB3JWMRK7PI4LPVLFHVA5CNFSM4HMISR32YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODVLWCUI#issuecomment-492265809, or mute the threadhttps://github.com/notifications/unsubscribe-auth/ALIJ6THS3Z2NQMJYBL2K2QTPVLFHVANCNFSM4HMISR3Q.