Open billtubbs opened 4 years ago
Not sure about this, but just wondering why you divide by the mean in the first few lines of ADMinitvary.m. Usually when I see the word 'normalize' it means subtracting the mean and/or dividing by the std. deviation.
I tried to look this up in the original Qu et al. paper. I think this is the bit where they describe the initialization process:
In fact, good initializations are critical for the proposed ADM algorithm to succeed in the linear sparsity regime. For this purpose, we suggest using every normalized row of Y as initializations for q, and solving a sequence of p nonconvex programs (II.2) by the ADM algorithm.
Presumably they didn't provide the code for this bit. Couldn't find anything more specific than that, I'm afraid.
Hi Bill--
My intuition is that you want to pick initializations that do a good job of 'covering' the null space. The problem is nonconvex, so the algorithm needs to start from lots of different 'locations' in the null space. Beyond that, I used their suggestion but only tried 'normalizing' by dividing by the mean, as is done in the code. You are absolutely correct that subtracting the mean and dividing by the standard deviation might be what they meant, or might work better (both readings are sketched just after this message). I am not aware of any theoretical results that provide guidance on this issue, but that is likely my own lack of reading on the topic. There have been many alternating-direction algorithms since the Qu et al. paper, so it is possible such guidance exists and I have not followed up on the new literature. My suggestion would be to either contact Qu et al. to see if they have better guidance, or try the initialization both ways yourself and see if there is a difference.
Sorry not to be more help, Niall
Niall Mangan, Assistant Professor, Engineering Sciences and Applied Mathematics, Northwestern University (niallmangan.com)
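To make the two readings of 'normalize' concrete, here is a minimal MATLAB sketch (illustrative only; Y and l are stand-ins, not variables from either repository):

Y = randn(5, 8);   % stand-in for the matrix whose rows seed the initializations
l = 1;             % index of the row being used
row = Y(l, :);

q0_mean   = (row / mean(row))';               % reading 1: divide by the mean (as in ADMinitvary.m)
q0_zscore = ((row - mean(row)) / std(row))';  % reading 2: z-score (subtract mean, divide by std)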
Thanks for responding! Much appreciated.
I don't know much about optimization so I'm not going to comment. As long as it works I guess... (The only concern I might have is what if it causes a divide-by-zero?)
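A quick illustration of that divide-by-zero worry (a hypothetical example, not code from the repo): any zero-mean row breaks the divide-by-the-mean initialization, while a standard-deviation-based scaling stays well-defined.

row = [1, -1, 2, -2];                % hypothetical row with mean(row) == 0
bad = row / mean(row);               % division by zero: every entry becomes +/-Inf
ok  = (row - mean(row)) / std(row);  % std(row) > 0 for any non-constant row, so this is fine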
Anyway, I did do some more digging for code and found this in Qu's code on the Google site (https://www.dropbox.com/s/baks94hux3k9zwd/Planted%20Sparse%20Vector.zip?dl=0):
% exhausting all initializations
flag = 0;
for l = 1:p
    q_0 = (Y(l,:)/norm(Y(l,:)))';
    q_mtx(:,l) = ADM(Y,q_0,lambda,MaxIter,tol_adm);
    cor_vec = abs(q_mtx(:,l)' * q_true);
    error = 1 - cor_vec;
    if(error<=tol_s)
        flag = 1;
        break;
    end
end
(It's from the file platform_PSV.m.)
Not 100% sure I'm looking at the right thing, but this looks like the section of code where the initialization is done. On the line q_0 = (Y(l,:)/norm(Y(l,:)))' he is dividing each row by its norm, so maybe that's what he meant by normalization.
Anyway, as I said, it's no big deal. I'm trying to convert your code to Python (mainly as a learning exercise for a class I am taking), so I will try both methods and see if they work (a side-by-side sketch of the candidate initializations follows this message).
Thanks, Bill.
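For reference when porting, a side-by-side sketch of the three initializations that have come up in this thread. Only the norm variant is confirmed from platform_PSV.m and the mean variant from ADMinitvary.m; the z-score variant is just one possible reading of 'normalize', and Y and l are stand-ins:

Y = randn(5, 8);   % stand-in for Y from the quoted code
l = 1;
row = Y(l, :);

q0_norm   = (row / norm(row))';               % platform_PSV.m: unit l2 norm; norm(row) == 0 only for an all-zero row
q0_mean   = (row / mean(row))';               % ADMinitvary.m: divide by the mean; undefined when mean(row) == 0
q0_zscore = ((row - mean(row)) / std(row))';  % z-score reading of 'normalize' (an assumption, not from either codebase)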
I agree with your read of their code. Good luck with the Python port! Let me know if I can help with anything else!
Niall