Slow indexing in Solr

buzzbangorg / bsbang-crawler

Alpha project for crawling bioschemas JSON-LD

Apache License 2.0


I measured the time it takes to index the documents for 20 JSON files from http://beta.synbiomine.org/synbiomine/sitemap.xml and found that our current way of indexing is very slow. It took around 15 seconds to index the 20 docs when we post and commit them one by one (we read a row from SQL and post it inside a for loop). If we instead collect the rows, convert them to JSON once, and post the list of 20 in a single request, the same work takes only 0.7 seconds. A possible explanation is that posting a single request to the server and waiting for the response takes about 0.7 seconds, so for 20 docs we make 20 requests: 20 × 0.7 s = 14 s. @justinccdev Have you noticed this before?

Test code - indexers.txt
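
For illustration, here is a minimal sketch of the batched approach described above. It is not the code from indexers.txt. The core name `bsbang` in the URL, the function names, and the default batch size are assumptions made for this example. The only Solr behaviour it relies on is that the JSON update handler accepts a JSON array of documents in one POST.

```python
import json
import urllib.request

# Assumed Solr core URL for this sketch; adjust to the real core name.
SOLR_UPDATE_URL = "http://localhost:8983/solr/bsbang/update?commit=true"


def batches(docs, size):
    """Yield consecutive lists of at most `size` documents."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def post_to_solr(docs, url=SOLR_UPDATE_URL):
    """POST a list of documents to Solr as one JSON array (one round trip)."""
    body = json.dumps(docs).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        response.read()


def index_docs(docs, post=post_to_solr, batch_size=20):
    """Index `docs` in batches and return the number of requests made.

    `post` is injectable so the batching logic can be exercised
    without a running Solr server.
    """
    requests_made = 0
    for batch in batches(docs, batch_size):
        post(batch)
        requests_made += 1
    return requests_made
```

With 20 rows and `batch_size=20`, `index_docs` makes one request instead of 20, which is where the saving measured above would come from if the per-request round trip is the bottleneck.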


Slow indexing in Solr #14