Closed MColbrook closed 1 year ago
Hi Matthew
There are a few places in the code where n = 100 is the cutoff for using a direct method over a fast method, or for persistently storing arrays. If you can send a minimal working example, I can take a look to see precisely where the bottleneck is in your case.
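The kind of size cutoff I mean looks roughly like this (a schematic of the pattern, not the actual ultraSEM source; `n_cutoff` and the choice of solvers here are placeholders for illustration):

```matlab
% Schematic: dispatch on problem size between a direct and a "fast" method.
n = 50;
A = gallery('tridiag', n);        % sample sparse SPD matrix
b = ones(n,1);
n_cutoff = 100;                   % hypothetical threshold
if n <= n_cutoff
    x = full(A)\b;                % direct dense solve: cheap at small n
else
    x = pcg(A, b, 1e-10, 500);    % iterative solve as the fast path at large n
end
```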
Thanks Nick! Here is the code; for smaller epsilon the PDE becomes more singular, hence the need for large n. Code snippet:
rng(1)
%% Parameters
order=2;
X=-0.1:0.01:1.1; % REDUCE SIZE IF NEEDED FOR TEST CASE
ep_vec=0.01; % 10.^(-2:0.1:0)
lambda = 0.1; % lengthscale of random function
dd = 1; % geometry parameter
g = randnfun2(lambda,[0,1+dd,0,1]);
rhs = @(x,y) g(x,y); % function g
[pts,al]=rational_kernel(order,'equi');
hh=0;
n=100;
dom = ultraSEM.Domain.quad([0 0 ; 1+dd 0 ; 1 1; 0 1]);
if hh>0
dom=refine(dom,hh);
end
rhs2 = ultraSEM.Sol(rhs, n, dom);
%% Compute smoothed measure at evaluation points
mu=zeros(length(X),length(ep_vec)); % smoothed measure, one column per epsilon
for lll=1:length(ep_vec)
epsilon = ep_vec(lll);
for i=1:length(X)
I=0;
for ii=1:order
% solve systems using ultraSEM
z=X(i)-pts(ii)*epsilon; % complex-shifted evaluation point z = X(i) - epsilon*pts(ii)
% variable-coefficient operator: z*u_xx + (z-1)*u_yy, no lower-order terms
pdo={{@(x,y) z+0*x+0*y,@(x,y) 0*x+0*y,@(x,y) z-1+0*x+0*y},...
{@(x,y) 0*x+0*y,@(x,y) 0*x+0*y},@(x,y) 0*x+0*y};
op=ultraSEM(dom, pdo, rhs, n);
bc = 0; % homogeneous Dirichlet data
sol=op \ bc;
% accumulate this pole's contribution to the smoothed measure
I=I+imag(al(ii)*sum2(sol.*rhs2));
end
mu(i,lll)=-I/pi;
end
end
%% Normalise so the smoothed measure integrates to 1 over X
mu = mu/sum(mu*(X(2)-X(1)));
%% Plot
plot(X,mu)
function [poles,res] = rational_kernel(m,type)
%RATIONAL_KERNEL Poles and residues of an order-m rational smoothing kernel.
% Inputs:
%   m    - order of the kernel
%   type - pole configuration: 'cheb', 'roots', 'extrap', or 'equi'
% Outputs:
%   poles - poles of the rational function
%   res   - residues of the rational function

% Poles
if type=="cheb"
z=1i+chebpts(m,1); % Chebyshev points of the first kind, shifted by 1i
elseif type=="roots"
z=exp(1i*pi*((1:m)'-1/2)/m); % 2m-th roots of -1 in the upper half-plane
elseif type=="extrap"
z=1i*(0.5.^(0:m-1)); % geometrically spaced points on the imaginary axis
elseif type=="equi"
z=1i+(2*(1:m)/(m+1)-1); % equispaced points, shifted by 1i
else
error('rational_kernel:type','Unknown pole type "%s".',type)
end
if type=="equi" && m<7 %Hard-coded kernels from table of the paper
if m==1
res=1;
elseif m==2
res=[(1+3i)/2;(1-3i)/2];
elseif m==3
res=[-2+1i;5;-2-1i];
elseif m==4
res=[(-39-65i)/24;(17+85i)/8;(17-85i)/8;(-39+65i)/24];
elseif m==5
res=[(15-10i)/4;(-39+13i)/2;(65)/2;(-39-13i)/2;(15+10i)/4];
else
res=[(725+1015i)/192;(-2775-6475i)/192;(1073+7511i)/96;...
(1073-7511i)/96;(-2775+6475i)/192;(725-1015i)/192];
end
else %Solve the transposed Vandermonde system V.'*res = e1 for the residues
V=zeros(m);
for i=1:m
V(:,i)=z(:).^(i-1); % columns are powers of the poles
end
e1=eye(m,1); % first standard basis vector
res=transpose(V)\e1;
end
poles=z;
end
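As a sanity check on the Vandermonde route (illustrative; this reproduces by hand what the function does for `type = 'equi'`), the residues solve the transposed system, and for m = 2 they should match the hard-coded table values:

```matlab
% Residues from the transposed Vandermonde system, m = 2, 'equi' poles.
m = 2;
z = 1i + (2*(1:m)/(m+1) - 1);    % equispaced poles, as in rational_kernel
V = z(:).^(0:m-1);               % Vandermonde matrix via implicit expansion
res = transpose(V)\eye(m,1);     % solve V.'*res = e1
disp(res)                        % expect [(1+3i)/2; (1-3i)/2], as in the table
```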
Hi Matthew
As expected, the issue seems to be in computing the multiplication matrices for the nonconstant coefficient problem. For n <= 100 these matrices are cached, but for n > 100 they are computed from scratch each time, and that expense quickly adds up.
As a workaround, you can set
nmax_limit = 201;
nstore_limit = 201;
in line 17 of ultraSEM/+util/multmat.m
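For intuition, the caching behaviour is roughly the standard `persistent` pattern below (a generic sketch only, not the actual ultraSEM code; `cached_multmat`, `build_matrix`, and the limit value are made up for illustration):

```matlab
function M = cached_multmat(n)
%CACHED_MULTMAT Illustrative sketch of size-limited caching (not ultraSEM code).
persistent cache                  % survives between calls to this function
nstore_limit = 201;               % only cache matrices up to this size
if isempty(cache)
    cache = containers.Map('KeyType','double','ValueType','any');
end
if n <= nstore_limit && isKey(cache, n)
    M = cache(n);                 % cache hit: reuse the stored matrix
    return
end
M = build_matrix(n);              % expensive construction from scratch
if n <= nstore_limit
    cache(n) = M;                 % store for subsequent calls
end
end

function M = build_matrix(n)
% Stand-in for the expensive multiplication-matrix construction.
M = spdiags(ones(n,3), -1:1, n, n);
end
```

Above the limit, every call pays the full construction cost, which is why the expense adds up inside the triple loop.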
Nick
Thanks a lot Nick - I think that nailed it!
I am having issues solving on a quadrilateral with a small number of patches but large n (there are reasons for wanting to do it this way round) via ultraSEM(dom, pdo, rhs, n). The code works beautifully for n=100, but even changing to n=101 stalls it completely. Is there a reason for this, such as an internal maximum n somewhere in the code? Ideally, I would like to take n up to 200 or so, even for a single patch.