SubjectRefresh / refresh

A Machine Learning Question Generator

Creating a caching mechanism #10

Open OliCallaghan opened 9 years ago

OliCallaghan commented 9 years ago

Caching

What is needed

Ok, so in order to stop people having to refresh the whole course every time a new question is created, I'm going to create a caching mechanism: the user sends a syllabusNumber and a pointNumber, and the caching mechanism outputs the appropriate data.
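As a rough idea of the lookup described above, here is a minimal sketch. The names cacheKey, getCached and scrape are made up for illustration, and a plain Map stands in for whatever store the real mechanism ends up using.

```javascript
// Hypothetical in-memory cache; the Map is a stand-in for the real store.
const cache = new Map();

// Build a single lookup key from the two values the user sends.
function cacheKey(syllabusNumber, pointNumber) {
  return `${syllabusNumber}:${pointNumber}`;
}

// Return cached data if present; otherwise run the (expensive) scrape once
// and remember its result for the next request.
function getCached(syllabusNumber, pointNumber, scrape) {
  const key = cacheKey(syllabusNumber, pointNumber);
  if (!cache.has(key)) {
    cache.set(key, scrape(syllabusNumber, pointNumber));
  }
  return cache.get(key);
}
```

The scrape callback is only called on a cache miss, so repeated requests for the same syllabus point do not hit the sources again.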

How?

The caching mechanism will use MongoDB and "poof!" as if by magic, it will work.

Why do we need it?

In order to stop the Node pixies running around like loonies on all of the sources which we scrape from, we need to build a caching mechanism to make the pixies only break a sweat once.

Oli, Why are you posting all this stuff here?

I am posting all this stuff here, because if anyone wants to know how the mechanism works in the future, whether it be improving or integrating other code into it, then this issue will be the place to go.

Well stop writing this, and make it first!

Good point, OK.

developius commented 9 years ago

@OliCallaghan Excellent. One thing: why don't you just check whether the file that we create to hold the data already exists in files/? We store everything in there, so you might as well just pull it out.
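A file check along those lines might look something like this. It is only a sketch: the helper name readIfCached, the one-JSON-file-per-item layout and the .json extension are all assumptions, not what the repo actually does.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: return the parsed contents of <dir>/<name>.json if the
// file exists, or null so the caller knows it still has to scrape.
function readIfCached(dir, name) {
  const file = path.join(dir, `${name}.json`);
  if (!fs.existsSync(file)) {
    return null;
  }
  return JSON.parse(fs.readFileSync(file, 'utf8'));
}
```

A caller would try readIfCached('files', someName) first and only fall back to scraping when it gets null.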

OliCallaghan commented 9 years ago

That is a good idea, but we still go and scrape hundreds of websites in order to create the gap fill exercises. To make it fast and simple, and to be able to get the next question immediately after a gap fill is completed, a full caching mechanism is the way forward.

developius commented 9 years ago

@OliCallaghan yep, although we might as well just store that data in files/finished/ to keep it super quick on the server end and avoid Mongo faff.

OliCallaghan commented 9 years ago

That's a good idea, but the issue with storing files is that IO stuff is about the same speed as Mongo faff, and when we use Mongo we get much more space to store the data. Also, if we were to scale, using a Mongo server means we don't have to copy files or run a shared data repository.

developius commented 9 years ago

@OliCallaghan true true. Get started then :stuck_out_tongue_winking_eye:

popey456963 commented 9 years ago

Node.JS has great MongoDB integration, I'd highly suggest using that over MySQL, a fairly archaic solution.

It'd also allow us to host the database and the server on the same computer easily, as MongoDB allows programmatic installation.

developius commented 9 years ago

@popey456963 who said MySQL was archaic?! But yeah, Mongo might be a good idea. I know how to move/update/backup MySQL though :wink:

popey456963 commented 9 years ago

With Mongo you have the idea of --drop with mongorestore (mongorestore --drop [path to dump]). You can back up in a similar way with mongodump. Programmatically, Node supports Mongo much better, allowing you to utilise it with equal ease to Meteor (almost...)

If you would like, I can do the link between the .js and the database, and you focus on the caching part, as I have some past experience with it?

developius commented 9 years ago

@popey456963 could you do that for us?

popey456963 commented 9 years ago

I am rather busy at the moment, so I doubt I would be able to do the caching logic any time soon. I could however write wrapper functions that you can just pass the things you want to test.

Would the following be the only things needed:

If so, you would (theoretically), be able to do something along the lines of:

refreshMongoWrapper.write("subjects", "6145", "possibleQuestionJSONStoredOrSomething");
refreshMongoWrapper.find("subjects", "6145"); // Returns list of questions WITH ID
refreshMongoWrapper.change("id", "changeItToThis"); // Changes a field via the ID (find)
refreshMongoWrapper.delete("id"); // Deletes the field with the ID

Although we might want to think of something shorter than refreshMongoWrapper. Then it would be similar to SQL, but far quicker to run, and it could be set up programmatically.
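To make the shape of those four calls concrete, here is a minimal in-memory stand-in. It does not talk to MongoDB at all: createWrapper, the generated string IDs and the stored fields are assumptions for illustration, and a real version would swap the Map for driver calls.

```javascript
// Hypothetical in-memory stand-in for the proposed wrapper. A Map keeps the
// sketch self-contained; a real version would call a MongoDB driver instead.
function createWrapper() {
  const docs = new Map(); // id -> { id, collection, key, value }
  let nextId = 1;

  return {
    // Store a value under (collection, key) and return the generated ID.
    write(collection, key, value) {
      const id = String(nextId++);
      docs.set(id, { id, collection, key, value });
      return id;
    },
    // Return a copy of every entry stored for (collection, key), with its ID.
    find(collection, key) {
      return [...docs.values()]
        .filter((d) => d.collection === collection && d.key === key)
        .map((d) => ({ ...d }));
    },
    // Replace the value of the entry with this ID; false if it is missing.
    change(id, value) {
      const doc = docs.get(id);
      if (!doc) return false;
      doc.value = value;
      return true;
    },
    // Remove the entry with this ID; true if something was removed.
    delete(id) {
      return docs.delete(id);
    },
  };
}
```

With this, something like const db = createWrapper() gives a shorter name to call write, find, change and delete on, and the caching code would not need to know what sits behind it.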

developius commented 9 years ago

@popey456963 that'd actually be excellent and really useful!

developius commented 9 years ago

@popey456963 how far did you get with that wrapper? @OliCallaghan how's the caching engine going?

popey456963 commented 9 years ago

@developius Thought I wasn't doing the wrapper any more as @OliCallaghan has already sorted DB communications?

OliCallaghan commented 9 years ago

@developius @popey456963 I've got Mongoose working, and it now does everything that it did before with MySQL. I'm currently in the process of rerouting the output to a MongoDB document.

developius commented 9 years ago

@OliCallaghan awesome. Keep us updated here.

developius commented 9 years ago

Partly finished. Awaiting question caching integration.

developius commented 9 years ago

This is 99% done. I'm changing the assignment due to the fact that I've nearly finished it.