LiUGraphQL / LinGBM-OptimizationTechniquesExperiments

Artifacts of LinGBM-based experiments to compare optimization techniques for GraphQL server implementations
0 stars 0 forks source link

Bug in undergraduateDegreeObtainedBystudent #1

Open hartig opened 4 years ago

hartig commented 4 years ago

The resolver for undergraduateDegreeObtainedBystudent is incorrect, at least for the cache-variation of the test server.

Look at line 267 of src/cache/resolver/resolver.js. Here, the resolver applies the offset and the limit before applying the filter. That was not the intention. Instead, the filter has to be applied first.

The same problem may exist in the other server variations as well, and maybe also in other resolvers (I have not checked).

/cc @chengsijin0817

hartig commented 4 years ago

Just checked: the resolver for undergraduateDegreeObtainedBystudent in the batch+cache server has the same bug.

chengsijin0817 commented 4 years ago

@hartig I agree. For the resolver implementation, the filter has to be applied first. I can fix this issue.

However, if only considering our existing query template. I think it does not affect the performance of the GraphQL server. The related query templates are QT7, QT9, QT13, and QT14. For these queries, the filter and limit/offset does not appear in the same template at the same time.

For example, when executing queries of QT13 or QT14, the line line 267 and line 268 will be skipped. Am I understand correctly?

hartig commented 4 years ago

In fact, it is even worse!

The resolver first fetches all the grad students who got their undergrad degree from the given university, no matter whether there is a filter condition or not. After that, offset and limit are applied to the resulting list of all these students. Next, if there is a filter condition, the resolver fetches all the students again, but this time together with data about their advisors. Hence, the first fetch is completely useless and unnecessary work (in fact, the second fetch cannot even benefit from caching the first fetch because these fetches are a bit different--once without advisors and once with advisor data--and, thus, they are in different caches).

But this unnecessary work is not the only problem: After fetching again the whole list of grad students, and filtering them, the resolver does not apply the offset and the limit!!

Besides these issues, there are two identical copies of code that implements the filter. This is such a bad programming style!

Moreover, the implementation is totally inefficient---the filter conditions, as well as the offset+limit, should be pushed into the SQL query instead of fetching all grad students first, and then applying the filter in the GraphQL server.

I assume that all these issues are present in all the server variants :-(

hartig commented 4 years ago

However, if only considering our existing query template. I think it does not affect the performance of the GraphQL server.

It does. See my follow-up comment.

At least, all server variants are affected in the same way (which is different for the bug in #2). In this sense, fixing this issue here is not the highest priority at the moment (in contrast to #2)

For example, when executing queries of QT13 or QT14, the line line 267 and line 268 will be skipped. Am I understand correctly?

These two lines should be moved to before the final return statement. Moreover, the assignment in line 265 should be moved into the main else block (but without let -- see #2). And these changes need to be done for all server variants. Additionally, the main if block can be optimized much more (see my comment above) and the code duplication should be removed in there.

chengsijin0817 commented 4 years ago

These two lines should be moved to before the final return statement. Moreover, the assignment in line 265 should be moved into the main else block (but without let -- see #2). And these changes need to be done for all server variants. Additionally, the main if block can be optimized much more (see my comment above) and the code duplication should be removed in there.

You are right. All these issues are present in all server variants. I can change the order of offset+limit and filter for all server variants first, and then, I can optimize the code according to your comments. Currently, I am doing experiments on changing the number of database connections. To make the measurement results be comparable with previous results. I keep this issue at the moment and fix it after finishing that experiment.