pnp / modernization

All modernization tooling and guidance
http://aka.ms/sppnp-modernize
MIT License

[BUG] : ModernizationScanner never finishes scanning large amounts of sites in the tenant and writes NO logfiles #505

Closed amarnath2510 closed 4 years ago

amarnath2510 commented 4 years ago

SharePoint Modernization Scanner version 2.12.0.0, configured to run with Azure ACS app-only authentication, never finishes scanning a large number of sites in a tenant (>10,000 sites). Even worse, because the log file is only written at the end of the run, NO log files are available even after it has run for several hours. The full scan never completes. The scan was configured to find only SharePoint Designer workflows.

After running for several hours (once approx. 7 or 8 hours and once over 18 hours; first try with 10 threads, second try with 30 threads), it once reached 95% and then looped endlessly with the message "Retrieving a batch of up to 500 search results". As the log file is unfortunately ONLY written when the scan ends successfully, NO log files are available.

We also ran the scan against the site collections batch-wise (3,000 sites per batch), but we still get the same error. Please let us know whether there is any way to generate the report.

jansenbe commented 4 years ago

Hi @amarnath2510 ,

Would it be possible to capture a Fiddler trace when the scan gets stuck in the "retrieve a batch of up to 500 search results" loop? This will help us understand whether the requests are being throttled or not. There's an open issue (#498) for updating the scanner to continuously persist the results.

jansenbe commented 4 years ago

Issue #498 has been implemented; the scanner now writes the scan results to CSV files every minute.
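For readers who want to picture what "persist the results every minute" can look like, here is a minimal Python sketch. It is not the scanner's code (the scanner is a .NET tool); the class name `PeriodicCsvWriter`, its parameters, and the append-on-interval strategy are all assumptions made for illustration.

```python
import csv
import time
from typing import Callable, List, Sequence


class PeriodicCsvWriter:
    """Hypothetical helper: buffers rows and appends them to a CSV file
    once at least `interval_seconds` have passed since the last flush."""

    def __init__(self, path: str, header: Sequence[str],
                 interval_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._path = path
        self._interval = interval_seconds
        self._clock = clock
        self._pending: List[Sequence[object]] = []
        self._last_flush = clock()
        # Write the header immediately so the file exists from the start.
        with open(path, "w", newline="", encoding="utf-8") as f:
            csv.writer(f).writerow(header)

    def add(self, row: Sequence[object]) -> None:
        """Buffer one result row; flush if the interval has elapsed."""
        self._pending.append(row)
        if self._clock() - self._last_flush >= self._interval:
            self.flush()

    def flush(self) -> None:
        """Append all buffered rows to the file and reset the timer."""
        if self._pending:
            with open(self._path, "a", newline="", encoding="utf-8") as f:
                csv.writer(f).writerows(self._pending)
            self._pending.clear()
        self._last_flush = self._clock()
```

With this shape, a scan that is interrupted loses at most the rows buffered since the last flush, provided `flush()` is also called once when the scan ends.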

Chipzter commented 4 years ago

Hi Bert, we are facing the exact same issue when scanning a mere 860 sites on one of our vanity domains, also using App ID auth. I downloaded the new release this morning, and the logs are getting written to disk periodically, but after it gets stuck in that batch retrieval loop, the output files seem to be rewritten or touched without any new rows being added. I'm rerunning it with Fiddler now, so perhaps I will be able to provide more information tomorrow, unless someone else beats me to it.

// Robert

Chipzter commented 4 years ago

Ooo 😎 ! Here's a clue: The "IndexDocId > xxx" condition in the search query does not seem to work when it needs to page result sets larger than 500 items. It keeps getting the same 500 items over and over. The IndexDocId of the last item is the same every time.
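To make the failure mode concrete, here is a small Python sketch of cursor-style paging on `IndexDocId`, with a guard that stops instead of looping when the cursor does not advance. It assumes results come back sorted by `IndexDocId` in ascending order. The function names and the in-memory `healthy_query` / `broken_query` stand-ins are hypothetical; this is not the scanner's actual code.

```python
from typing import Callable, Dict, Iterator, List

Row = Dict[str, int]


def page_by_index_doc_id(run_query: Callable[[int, int], List[Row]],
                         batch_size: int = 500) -> Iterator[Row]:
    """Yield every row, asking each time for rows with IndexDocId > last_id.

    Raises RuntimeError if a batch does not move the cursor forward, which
    is the symptom described above (same 500 items over and over)."""
    last_id = 0
    while True:
        rows = run_query(last_id, batch_size)
        if not rows:
            return
        new_last_id = rows[-1]["IndexDocId"]
        if new_last_id <= last_id:
            raise RuntimeError(
                f"Paging cursor stuck at IndexDocId {new_last_id}")
        yield from rows
        last_id = new_last_id
        if len(rows) < batch_size:
            return  # a short batch means this was the last page


# Hypothetical stand-ins for the search service: 1200 indexed items.
DOC_IDS = list(range(1, 1201))


def healthy_query(after_id: int, limit: int) -> List[Row]:
    """Honors the 'IndexDocId > after_id' filter."""
    return [{"IndexDocId": i} for i in DOC_IDS if i > after_id][:limit]


def broken_query(after_id: int, limit: int) -> List[Row]:
    """Ignores the filter, so every call returns the same first page."""
    return [{"IndexDocId": i} for i in DOC_IDS][:limit]
```

Consuming `page_by_index_doc_id(healthy_query)` yields all 1200 rows in three batches, while `page_by_index_doc_id(broken_query)` raises on the second batch rather than spinning forever.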

jansenbe commented 4 years ago

Hi @Chipzter ,

Been a while since we spoke :-)

Thanks for the debugging, I'm going to try and repro this on my side.

jansenbe commented 4 years ago

Hi @Chipzter , @amarnath2510 ,

I'm prepping version 2.15 with a fix for this, if you want to test that version before I release it then please let me know.

Chipzter commented 4 years ago

Thanks @jansenbe! Absolutely! 🙂 Just let me know when it's available, and I can give it a go right away.

jansenbe commented 4 years ago

It's in the place where we shared the trace file

Chipzter commented 4 years ago

I noticed that SharePoint added QueryModifications to the search queries, resulting in too many results. Explicitly setting the SourceID for the "Local SharePoint Results" result source seems to fix the issue. Additionally, unless this property is set EVERY time the query is executed, the problem reappears. Have a look at the PR and see if it makes sense. See also: https://www.eliostruyf.com/best-practice-local-sharepoint-results-source-id-gain-search-control/
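As an illustration of "set the source ID on every request", here is a hedged Python sketch that builds a SharePoint search REST URL with the `sourceid` parameter always included. The PR itself changes the scanner's .NET code, so this only shows the idea. The helper name `build_search_url` and the `contoso` tenant URL are hypothetical; the GUID is the commonly documented ID of the "Local SharePoint Results" source, which you may want to verify for your environment.

```python
from urllib.parse import quote

# Commonly documented ID of the "Local SharePoint Results" result source.
LOCAL_SHAREPOINT_RESULTS_SOURCE_ID = "8413cd39-2156-4e00-b54d-11efd9abdb89"


def build_search_url(site_url: str, query_text: str,
                     row_limit: int = 500) -> str:
    """Build a search REST URL that always pins the result source.

    Because the URL is rebuilt for every page of results, the source ID
    cannot silently drop off on the second and later requests."""
    params = {
        "querytext": "'" + query_text + "'",
        "sourceid": "'" + LOCAL_SHAREPOINT_RESULTS_SOURCE_ID + "'",
        "rowlimit": str(row_limit),
        "trimduplicates": "false",
    }
    # Percent-encode values but keep the single quotes the REST API expects.
    encoded = [key + "=" + quote(value, safe="'")
               for key, value in params.items()]
    return site_url.rstrip("/") + "/_api/search/query?" + "&".join(encoded)


# Usage with a hypothetical tenant URL and query:
url = build_search_url("https://contoso.sharepoint.com/",
                       "contentclass:STS_Site")
```

Routing every request through one builder like this is one way to guarantee the property is never omitted, whichever client library sends the request.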

Chipzter commented 4 years ago

Hi, just a quick note to let you know that version 2.15 fixed this issue for me. I successfully scanned over 23,000 sites in about 22 hours with no need to break out of any infinite loop. Hopefully @amarnath2510's problem had the same root cause. Thanks!

amarnath2510 commented 4 years ago

Hi Robert,

Good Day!

I have started running the scanner for 21,000 sites today. I will update once it is completed. Output is now generated every minute.

Thanks in Advance.

Regards, Amarnath Reddy C

jansenbe commented 4 years ago

Closing this issue now; version 2.15 will no longer hang due to a search query. Delays due to throttling can still happen, though.