danfortunato / ultraSEM

The ultraspherical spectral element method
MIT License
29 stars 3 forks source link

Optimize the code #12

Open danfortunato opened 5 years ago

danfortunato commented 5 years ago

ultraSEM is currently slower than specdd, e.g. for variable-coefficient problems with many elements and moderate p. What optimizations can we make to ultraSEM to speed it up?

Some ideas:

@nickhale, do you see any good places in the code for optimization?

nickhale commented 5 years ago

@danfortunato, do you have a specific example we can use for benchmarking?

nickhale commented 5 years ago

Running some profile tests, my initial thoughts are

nickhale commented 5 years ago

If my understanding is correct, then since we're solving in kron form, we don't need the rank structure, so we don't need Chebfun2. All we need to do is evaluate the coefficient functions on an n*n Chebyshev grid and compute the bivariate Chebyshev expansion.

ajt60gaibb commented 5 years ago

Right, I agree. I don’t think we need the adaptive part of the chebfun2 constructor either. If we need the numerical degree, then we can just look at when the bivariate Chebyshev coefficients go to zero.

nickhale commented 5 years ago

I don't think we're currently using the adaptive part. Even non-adaptive construction is very slow. (Overheads.)

nickhale commented 5 years ago

I assume also don't want to be doing the Schur solve for non-constant coefficients?

nickhale commented 5 years ago

I did a hacky implementation. Helps a lot.

Comments:

danfortunato commented 5 years ago

A lot of the ultraSEM tests now fail and throw errors. @nickhale, any idea why? It seems to be stemming from your changes above.

nickhale commented 5 years ago

Hi @danfortunato . I'm not seeing any failures my end. Which ones aren't working for you?

ajt60gaibb commented 5 years ago

All the tests work for me, apart from test_updateRHS. Looks like I am missing the canonicalBC function.

danfortunato commented 5 years ago

Ah sorry, it was due to path and clear classes issues. The tests work now.

danfortunato commented 5 years ago

@nickhale and @ajt60gaibb, what are you thoughts on using the SVD here? Ideally we'd like to be able run with n = 100.

I think we want to do the Schur/Woodbury solve for all PDEs, including variable coefficient. If the degree required to represent the variable coefficient is less than n then the systems will still be almost banded.

nickhale commented 5 years ago

what are you thoughts on using the SVD here?

I used the SVD for simplicity of implementation. I have no objection to using ACA to get the low rank approx, other than that I didn't have time to implement it.

I think we want to do the Schur/Woodbury solve for all PDEs, including variable coefficient. If the degree required to represent the variable coefficient is less than n then the systems will still be almost banded.

Presumably this is something we can check (since we know the length of the coeffs) and decide on a case-by-case basis.

danfortunato commented 5 years ago

Presumably this is something we can check (since we know the length of the coeffs) and decide on a case-by-case basis.

I think the complexity of a non-Schur/Woodbury solve would be O(p^5) though, right?