Closed SolSearch closed 6 years ago
I have placed http://localhost:8983/solr/gettingstarted_shard1_replica2/ under the committer tag
If you are using Solr Cloud, you should reference your collection. Based on the URL sample you provided, it probably is http://localhost:8983/solr/gettingstarted. See if that makes a difference. If not, check the HTTP Collector logs and the Solr logs for potential errors.
I have tried with http://localhost:8983/solr/gettingstarted but still don't see any documents in the collection.
I know it is more of a general Solr question, but I want to make sure I am doing it right. Perhaps I don't need to have multiple shards. I want to be able to search the contents of website 1 OR website 2 OR (website 1 AND website 2).
With a collection with two shards, how do I know which shard to search if I only want to search website 1 and not website 2, for example? As I understand it, website 1 contents could be in either of the two shards. I am wondering if the better approach is to index the website 1 and website 2 documents (about 1 million documents) in one core and retrieve the documents from the two sites using the fq parameter, e.g., fq=webtype:web1.
With Solr Cloud, you reference "collections", which can be spread across one or several shards. That should be transparent to you when you query. If you want to isolate each site, it is probably easier to simply create a different Solr collection for each, or to add a field to your existing collection that tells you what the source of the document is (you can filter on that). You can use a ConstantTagger from the Importer module to help with that.
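To illustrate the source-field approach, here is a minimal Python sketch that builds a Solr /select URL restricted to one or more sites with the fq parameter. The field name webtype and the values web1 and web2 come from the question above; the collection URL is the one suggested earlier in this thread; the function name build_select_url is hypothetical. It assumes the source values are simple tokens that need no Solr query escaping, and it only builds the URL without sending a request.

```python
from urllib.parse import urlencode

# Collection URL suggested earlier in this thread; adjust to your setup.
SOLR_COLLECTION_URL = "http://localhost:8983/solr/gettingstarted"


def build_select_url(user_query, sources=None, source_field="webtype"):
    """Build a /select URL, optionally restricted to the given sources.

    `sources` is a list of values stored in `source_field`
    (e.g. ["web1"] or ["web1", "web2"]). Both names are assumptions
    taken from the question, not something Solr defines.
    """
    params = [("q", user_query)]
    if sources:
        # Produces e.g. fq=webtype:(web1 OR web2)
        clause = " OR ".join(sources)
        params.append(("fq", f"{source_field}:({clause})"))
    return f"{SOLR_COLLECTION_URL}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(build_select_url("test", ["web1"]))          # website 1 only
    print(build_select_url("test", ["web1", "web2"]))  # either website
    print(build_select_url("test"))                    # no source filter
```

With this layout a single collection serves all three cases (site 1, site 2, or both), and the shard placement of each document does not matter to the caller.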
What query do you issue to find documents? And can you confirm you have no errors in the HTTP Collector and Solr logs? Please attach them if you can.
If I create separate Solr collections, then how can I specify to include collection 1 and not collection 2? This is why I am probably better off including the documents in one collection and then filtering the query by a field that tells the source of the document, as you also suggested. In any case, I don't see any errors in the attached logs, but if I use the approach of filtering by the source, the issue is no longer relevant for now.
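For the separate-collections variant, SolrCloud should accept a collection request parameter listing the collections to search, so a query can target one collection or several. The sketch below only builds such a URL. The collection names website1 and website2 and the function name build_multi_collection_url are hypothetical, and it assumes the collections share a compatible schema.

```python
from urllib.parse import urlencode

# Base Solr URL from this thread; collection names below are hypothetical.
SOLR_BASE_URL = "http://localhost:8983/solr"


def build_multi_collection_url(user_query, collections):
    """Build a /select URL that searches only the listed collections.

    The request is addressed to the first collection, and the
    `collection` parameter names every collection to include.
    """
    if not collections:
        raise ValueError("at least one collection is required")
    params = {"q": user_query, "collection": ",".join(collections)}
    return f"{SOLR_BASE_URL}/{collections[0]}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Collection 1 only, then both collections.
    print(build_multi_collection_url("test", ["website1"]))
    print(build_multi_collection_url("test", ["website1", "website2"]))
```

Leaving a collection out of the list excludes it from the search, which covers the "collection 1 and not collection 2" case.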
Are you still having issues committing documents or can we close?
I am able to commit documents when I create a core. You can close the case, thanks.
On Tuesday, January 16, 2018, 9:25:48 PM EST, Pascal Essiembre wrote:
> Are you still having issues committing documents or can we close?
Hello,
I have downloaded the HTTP Collector and it works great with a core. I have a requirement to search more than one website depending on the user's selection. I understand I need to create multiple shards and then include those shards in my query. I have started Solr in cloud mode and followed the script to create shards. How can I use the Norconex HTTP Collector to index website documents? In the Norconex config file I have specified
but I don't see any documents indexed under gettingstarted_shard1_replica2.
I am using the same schema and solrconfig XML files as the ones I used when I was able to index documents in a core.