danfortunato / ultraSEM

The ultraspherical spectral element method
MIT License
29 stars 3 forks source link

Bound on N? #42

Closed MColbrook closed 1 year ago

MColbrook commented 1 year ago

I am having issues solving on a quadrilateral with a small number of patches but large n (there are reasons for wanting to do it this way round) via ultraSEM(dom, pdo, rhs, n). The code works beautifully for n=100, but even changing to n=101 stalls it completely. Is there a reason for this, such as an internal maximum n somewhere in the code? Ideally, I would like to take n up to 200 or so, even for a single patch.

nickhale commented 1 year ago

Hi Matthew

There are a few places in the code when n = 100 is the cutoff for using a direct method over a fast method or for persistently storing arrays. If you can send a minimal working example then I can take a look to see precisely where the bottleneck is in your case.

MColbrook commented 1 year ago

Thanks Nick! Here is the code, for smaller epsilon the PDE becomes more singular and hence the need for large n. Code snippet:

rng(1)
%% Parameters
order=2;
X=-0.1:0.01:1.1; % REDUCE SIZE IF NEEDED FOR TEST CASE
ep_vec=0.01; % 10.^(-2:0.1:0)
lambda = 0.1; % lengthscale of random function
dd = 1; % geometry parameter

g = randnfun2(lambda,[0,1+dd,0,1]);
rhs = @(x,y) g(x,y);  % function g
[pts,al]=rational_kernel(order,'equi');

hh=0;
n=100;

dom = ultraSEM.Domain.quad([0 0 ; 1+dd 0 ; 1 1; 0 1]);
if hh>0
    dom=refine(dom,hh);
end

rhs2 = ultraSEM.Sol(rhs, n, dom);

%% Compute smoothed measure at evaluation points
mu=zeros(length(X),length(ep_vec));

for lll=1:length(ep_vec)
    epsilon = ep_vec(lll);
    for i=1:length(X)
            I=0;
            for ii=1:order
                % solve systems using ultraSEM
                z=X(i)-pts(ii)*epsilon;
                pdo={{@(x,y) z+0*x+0*y,@(x,y) 0*x+0*y,@(x,y) z-1+0*x+0*y},...
                    {@(x,y) 0*x+0*y,@(x,y) 0*x+0*y},@(x,y) 0*x+0*y};
                op=ultraSEM(dom, pdo, rhs, n); bc = 0;
                sol=op \ bc;        
                I=I+imag(al(ii)*sum2(sol.*rhs2));
            end
            mu(i,lll)=-I/pi;
    end
end
%%
mu = mu/sum(mu*(X(2)-X(1)));

%% Plot
plot(X,mu)

function [poles,res] = rational_kernel(m,type)
%%rational_kernel 
% Inputs: 
%         - order of kernel 'm'
%         - Pole location 'roots','cheb'
% Outputs: - poles of rational function 'poles' 
%         - residues of rational function 'res'

%Poles
if type=="cheb"
    z=1i+chebpts(m,1);  %Chebyshev points
    %z=1i+(2*(1:m)/(m+1)-1);%.^2.*sign((2*(1:m)/(m+1)-1));
elseif type=="roots"
    z=exp(1i*pi*((1:m)'-1/2)/m); %Roots of unity
elseif type=="extrap"
    z=1i*(0.5.^(0:m-1));
elseif type=="equi"
    z=1i+(2*(1:m)/(m+1)-1);
end

if type=="equi" && m<7 %Hard-coded kernels from table of the paper
    if m==1
        res=1;
    elseif m==2
        res=[(1+3i)/2;(1-3i)/2];
    elseif m==3
        res=[-2+1i;5;-2-1i];
    elseif m==4
        res=[(-39-65i)/24;(17+85i)/8;(17-85i)/8;(-39+65i)/24];
    elseif m==5
        res=[(15-10i)/4;(-39+13i)/2;(65)/2;(-39-13i)/2;(15+10i)/4];
    else
        res=[(725+1015i)/(192);(-2775-6475i)/(192);(1073+7511i)/(96);(1073-7511i)/(96);(-2775+6475i)/(192);(725-1015i)/(192)];
    end
else %Vandermonde matrix for poles
    V=zeros(m);
    for i=1:m 
        V(:,i)=(z).^(i-1);
    end

    %Get residues directly
    rhs=eye(m,1);
    res=transpose(V)\rhs;
end

poles=z;

end
nickhale commented 1 year ago

Hi Matthew

As expected, the issue seems to be in computing the multiplication matrices for the nonconstant coefficient problem. For n <= 100 these matrices are cached, but for n > 100 they are computed from scratch each time, and that expense quickly adds up.

As a workaround, you can set

nmax_limit   = 201;
nstore_limit = 201;

in line 17 of ultraSEM/+util/multmat.m

Nick

MColbrook commented 1 year ago

Thanks a lot Nick - I think that nailed it!