qasim closed this 8 years ago
The goal is to house all the things we do in each filter file in one place and abstract them. It will work much like how an actual language interprets syntax.
Happening at cobalt/filter-revamp.
Here's what I've got for a preliminary filter endpoint function under the new QueryParser model:
https://github.com/cobalt-uoft/cobalt/blob/filter-revamp/src/api/buildings/routes/filter.js
On average, filter requests in the new model fare slightly faster than the current stable release (0.4.3), tested using Node.js 6.0 for both (~100ms difference testing 100 sequential requests, averaged over 10 attempts). Not that significant, but it's good to know it's not slower.
I still haven't addressed things that require MapReduce. I'm looking into using MongoDB's new aggregate functions and whether they are speedier. Will report back as soon as I get something conclusive.
QueryParser
https://github.com/cobalt-uoft/cobalt/blob/filter-revamp/src/api/utils/query-parser/index.js
courses/filter
https://github.com/cobalt-uoft/cobalt/blob/filter-revamp/src/api/courses/routes/filter.js
https://github.com/cobalt-uoft/cobalt/blob/filter-revamp/src/api/courses/routes/filterMapReduce.js
@kshvmdn this is what query parsing + a mapreduce looks like under the new model. What do you think? I'm exhausted from looking at this so please help me dig around and see if we can simplify this at all ._.
Only got to take a brief look at this (will test in depth when I get the chance), but it looks good so far.
I'm assuming date/time parsing is not complete yet (unless the plan is to ignore invalid input, in which case, this value needs to be returned).
I think mapreduce looks fine, we might be able to wrap filter comparisons into a function so we don't have to repeat code for arrays and non-arrays. Other than that, I don't think there's much else we can do.
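A rough sketch of what that wrapper could look like, in plain JavaScript. The name `matches` and the operator table are made up for illustration and are not taken from the cobalt code; whether it can be dropped into the map function as-is depends on how that function gets passed to MongoDB.

```javascript
// Hypothetical helper, not from the cobalt codebase.
// Applies one comparison to a field that may hold a single value or an array.
const operators = {
  '=': (a, b) => a === b,
  '>': (a, b) => a > b,
  '<': (a, b) => a < b,
  '>=': (a, b) => a >= b,
  '<=': (a, b) => a <= b
}

function matches (fieldValue, operator, operand) {
  const compare = operators[operator]
  if (!compare) {
    throw new Error('Unknown operator: ' + operator)
  }
  // Normalize a scalar to a one-element array so both cases share one code path.
  const values = Array.isArray(fieldValue) ? fieldValue : [fieldValue]
  // An array field matches if any of its elements satisfies the comparison.
  return values.some(value => compare(value, operand))
}

console.log(matches(5, '>=', 3)) // true
console.log(matches([1, 2, 8], '>', 7)) // true
console.log(matches([1, 2], '>', 7)) // false
```

Treating a scalar as a one-element array is the design choice here: the comparison logic is written once, and the "any element matches" rule for arrays is an assumption that may not fit every field.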
If I think of anything, I'll let you know.
@kshvmdn some good news on the date parsing: I found out that we can use the numeric comparison operators on strings in the case of dates, and MongoDB will handle it for us as long as both sides of the comparison are strings (which in this case they are). That means I retired date_num, and we don't have to deal with that mess in tests anymore either.
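To illustrate why this works: zero-padded YYYY-MM-DD strings sort lexicographically in the same order as the dates they represent. The snippet below is only a sketch; the `dateFilter` helper and the `date` field name are assumptions for the example, while `$gt`, `$lt`, `$gte` and `$lte` are MongoDB's standard comparison operators.

```javascript
// Zero-padded YYYY-MM-DD strings compare the same way as the dates themselves,
// so a plain string comparison gives the chronological answer.
const earlier = '2016-04-28'
const later = '2016-05-01'
console.log(earlier < later) // true

const sorted = ['2016-12-01', '2016-04-28', '2015-09-10'].sort()
console.log(sorted) // [ '2015-09-10', '2016-04-28', '2016-12-01' ]

// Hypothetical helper: builds a query object for a string-typed date field.
// The field name "date" is an assumption for this example.
const mongoOperators = { '>': '$gt', '<': '$lt', '>=': '$gte', '<=': '$lte' }

function dateFilter (operator, value) {
  const mongoOperator = mongoOperators[operator]
  if (!mongoOperator) {
    throw new Error('Unsupported date operator: ' + operator)
  }
  const condition = {}
  condition[mongoOperator] = value
  return { date: condition }
}

console.log(JSON.stringify(dateFilter('>', '2016-04-28')))
// {"date":{"$gt":"2016-04-28"}}
```

This only holds while every stored value and every operand uses the same zero-padded format; something like "2016-4-28" would sort incorrectly.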
I've also added proper error throwing; better to tell the user something went wrong, I'd think.
The filter code is probably some of the oldest living code in the entire project; it was one of the first things Ivan and I worked on back in 2014. It worked well with one or two endpoints, but we're almost at 10 now and it's about time to revisit this.
I'm going to start working on a query tokenizer module along with a token parser, laying out the groundwork for all future APIs to follow (and we will slowly move old filter code to the new one).
Here's what I'm thinking so far:
Filters look like date:>"2016-04-28", code:-"CSC108"
The query is split on AND first, and then each of those pieces is split on OR (since AND takes precedence)
Dates: YYYY-MM-DD
Times: HH:MM, or simply just seconds until midnight
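A minimal sketch of the split-then-parse idea above. The token shape (`key`, `operator`, `value`), the regular expression, and the function names are assumptions for illustration, not the actual QueryParser design; the `-` prefix is carried over from the example filter without assuming what it means.

```javascript
// Sketch only: token shape and operator list are assumptions.
// Matches key:"value" with an optional operator prefix before the quotes.
const TOKEN = /^(\w+):(>=|<=|>|<|-)?"([^"]*)"$/

function parseToken (text) {
  const match = TOKEN.exec(text.trim())
  if (!match) {
    throw new Error('Invalid filter token: ' + text)
  }
  // No prefix is treated as plain equality in this sketch.
  return { key: match[1], operator: match[2] || '=', value: match[3] }
}

function tokenize (query) {
  // Outer split on AND, inner split on OR: the result is a list of OR-groups
  // that all have to hold.
  return query
    .split(' AND ')
    .map(group => group.split(' OR ').map(parseToken))
}

const tokens = tokenize('date:>"2016-04-28" AND code:-"CSC108" OR code:"CSC148"')
console.log(JSON.stringify(tokens, null, 2))
```

Here `tokens` has two groups: the first holds the single date token, the second holds the two code tokens joined by OR. Splitting on the literal strings ' AND ' and ' OR ' is naive and would break on a quoted value containing either one, which is one reason a real tokenizer pass would be needed.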