feroult / yawp

Kotlin/Java API framework for Google Appengine
http://yawp.io
MIT License
131 stars 19 forks source link

Always fetch by id #125

Closed luanpotter closed 4 years ago

coveralls commented 4 years ago

Coverage Status

Coverage decreased (-0.04%) to 64.198% when pulling 7e21c07086ea401f16c574351b4e586e15572ce0 on fetch-by-id into b90deb905edd3fdb3009a5525e310cd17ead7f3d on master.

feroult commented 4 years ago

@luanpotter thanks, this is great to have this kind of cache system on yawp ;)

I think the only problem is that we are calling fetch for each id sequentially.

If we use this get(Iterable) datastore api it will do the fetches in parallel giving the same performance as before.

Google probably does the same when we query for objects, since the entities are spread over multiple servers.

Thanks again!

luanpotter commented 4 years ago

Totally agreed, @feroult , missed out that entirely. However, it might be a bit hard to implement in the current schema because I can't know beforehand if the record will have a forcedResult or not before querying so that I could split then into the ones I send to datastore and the ones I fetch locally. Take a look at the following implementation example:

    private List<T> executeQuery() {
        List<IdRef<T>> ids = doIdsIgnoringPostFilterAndOrder();

        // this map will keep track of every id we need to fetch
        Map<IdRef<T>, T> fetchMap = ids.stream().collect(HashMap::new, (map, id) -> map.put(id, null), HashMap::putAll);
        // this list is the ones we are going to need to fetch from datastore
        List<IdRef<T>> listToFetch = new ArrayList<>();

        fetchMap.entrySet().stream().map(Map.Entry::getKey).forEach(e -> {
            if (Cache.has(e)) {
                fetchMap.put(e, Cache.get(e));
            } else if (hasForcedResults?) {
                // has forced results? I don't know...
                fetchMap.put(e, forcedResult);
                                // I could create a "fake" QueryBuilder and call RepositoryHooks.beforeQuery by hand and check for forcedResults, I guess...
            } else {
                listToFetch.add(e);
            }
        });

        // fetch the ones not found on cache using a new fetchAll driver method
        fetchMap.putAll(r.driver().query().fetchAll(listToFetch));
        List<T> objects = new ArrayList<>(fetchMap.values());
        return postFilter(objects);
    }

See the missing bit? So I think there are three options, (1) do the suggestion in the comment, I think it will work but looks a little bit dirty to me, or (2) change the forcedResult APIs for something else even more generic so that it could be used here somehow, or (3) actually integrate my CacheHook into yawp, i.e., no longer use a hook, so that every application has a memcache cache option built-in (given proper specification on the yaml, of course), and get rid of the forcedResult APIs altogether.

What do you think?

luanpotter commented 4 years ago

Something like this is option (1) (now very much cleaned up from the example before):

    private List<T> executeQuery() {
        List<IdRef<T>> ids = doIdsIgnoringPostFilterAndOrder();

        // this map will keep track of every id we need to fetch
        Map<IdRef<T>, T> fetchMap = new HashMap<>();
        // this list is the ones we are going to need to fetch from datastore
        List<IdRef<T>> listToFetch = new ArrayList<>();

        ids.forEach(e -> {
            if (Cache.has(e)) {
                fetchMap.put(e, Cache.get(e));
            } else {
                QueryBuilder<T> q = new QueryBuilder<>(clazz, r).whereById("=", e);
                q.r.namespace().set(getClazz());
                q.executedQueryType = QueryType.FETCH;
                RepositoryHooks.beforeQuery(q);
                q.r.namespace().reset();
                if (q.hasForcedResponse()) {
                    fetchMap.put(e, q.getForcedResultFetch());
                } else {
                    listToFetch.add(e);
                }
            }
        });

        // fetch the ones not found on cache using a new fetchAll driver method
        fetchMap.putAll(r.driver().query().fetchAll(listToFetch));
        List<T> objects = new ArrayList<>(fetchMap.values());
        return postFilter(objects);
    }
luanpotter commented 4 years ago

We won't use this approach.