Open GoogleCodeExporter opened 8 years ago
http://groups.google.com/group/solrnet/msg/e05ac9b473e0de2d
On indexing multiple files per request: http://www.mail-archive.com/solr-user@lucene.apache.org/msg33954.html
Original comment by mauricio...@gmail.com
on 26 Mar 2010 at 1:58
Added basic support for the Solr ExtractingRequestHandler extension. Since I
have no experience with .NET, the code will need some extra work from someone
else.
Usage:
using (Stream f = File.OpenRead("c:\\example.pdf")) { // closes the stream even if AddFile throws
    solr.AddFile(f, new Dictionary<string, string> {{ "literal.id", "id1234" }});
}
solr.Commit();
Original comment by mrandres...@gmail.com
on 29 Jun 2010 at 3:45
Thanks! I'll review it when I get some time.
Original comment by mauricio...@gmail.com
on 29 Jun 2010 at 4:05
Ok, I reviewed the patch, it's a good start, but here are the issues I found:
* No tests
* Only works with FileStreams (should work with any Stream)
* Uses a buffer the size of the file - big files would use lots of memory
* Depends on Windows association of file extension to find out content-type: I'm not sure how reliable this is. For example, does a bare-bones Windows installation know about application/pdf? Is setting the correct content-type required? Getting the content-type of a generic Stream could be difficult.
* Some code duplication between Post() and PostBinary() - some refactoring needed there.
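
The stream and buffer-size points above could be addressed along these lines. This is only a minimal sketch, not the code from the patch: `StreamCopy.CopyInChunks` is a hypothetical helper name, and the 8192-byte default is an arbitrary assumption. It reads from any `Stream` in fixed-size chunks, so memory use does not grow with the size of the file.

```csharp
using System;
using System.IO;

public static class StreamCopy {
    // Hypothetical helper (not part of SolrNet): copies any readable Stream
    // to a destination Stream using a fixed-size buffer.
    public static long CopyInChunks(Stream source, Stream destination, int bufferSize = 8192) {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (destination == null) throw new ArgumentNullException(nameof(destination));
        if (bufferSize <= 0) throw new ArgumentOutOfRangeException(nameof(bufferSize));

        var buffer = new byte[bufferSize]; // constant size, independent of the input length
        long total = 0;
        int read;
        // Read returns 0 only at end of stream; it may return fewer bytes than requested.
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0) {
            destination.Write(buffer, 0, read);
            total += read;
        }
        return total; // number of bytes copied
    }
}
```

The destination would typically be the HTTP request stream. On .NET 4 and later, the built-in `Stream.CopyTo` does the same job.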
I applied the patch in a new branch:
http://github.com/mausch/SolrNet/tree/ExtractingRequestHandler
Original comment by mauricio...@gmail.com
on 3 Jul 2010 at 6:38
It seems that Solr does *not* support multiple files in a single request:
http://www.mail-archive.com/solr-user@lucene.apache.org/msg33997.html
Original comment by mauricio...@gmail.com
on 11 Dec 2010 at 2:44
I'm currently implementing the ExtractingRequestHandler in my SolrNet fork, and I
hope to sort out the issues Mauricio raised about the earlier patch. If anyone
has an idea of how they would like it to work, please let me know. I will try to
get some unit testing done, but I'm not used to writing tests, so I may need
some help.
Original comment by nazmul...@gmail.com
on 8 Feb 2011 at 11:14
Feel free to post any questions in the google group.
Original comment by mauricio...@gmail.com
on 9 Feb 2011 at 1:11
So what's the status of the ExtractingRequestHandler in SolrNet?
Original comment by jeroen.g...@gmail.com
on 18 Feb 2011 at 9:23
Status update:
http://groups.google.com/group/solrnet/browse_thread/thread/8babf22c83e59aa1
Original comment by mauricio...@gmail.com
on 18 Feb 2011 at 6:37
Merged with master in 80beaac9cf608ed37b67741c1be2deffcfea9551
Added an integration test in 9c7523dc9a767694d2d3b181c9a85e67807cc9ad. It
could use some more integration tests.
Original comment by mauricio...@gmail.com
on 23 Feb 2011 at 5:53
Is the ID really required in ExtractParameters? The ID value could also be
provided through fmap.
Original comment by mauricio...@gmail.com
on 30 Apr 2011 at 7:57
Answering #13:
http://wiki.apache.org/solr/ExtractingRequestHandler#Getting_Started_with_the_Solr_Example
says: "the literal.id=doc1 param provides the necessary unique id for the
document being indexed"
Original comment by mauricio...@gmail.com
on 8 Dec 2011 at 1:33
The handler requires an ID field, which is a good idea to have anyway, but it
is hard-coded to lowercase "id" in the ExtractCommand. I have a fix for this
in my fork. I haven't done much testing around it yet, though...
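
For illustration only, one way to avoid the hard-coded field name might look like the sketch below. It is not the fix from the fork mentioned above: `ExtractParams.LiteralId` is a hypothetical helper, not SolrNet's API. The caller passes the schema's unique key field name, and the helper builds the matching `literal.<field>` request parameter.

```csharp
using System;
using System.Collections.Generic;

public static class ExtractParams {
    // Hypothetical helper (not SolrNet's API): builds the literal.<uniqueKey>
    // parameter from a caller-supplied field name instead of assuming "id".
    public static KeyValuePair<string, string> LiteralId(string uniqueKeyField, string value) {
        if (string.IsNullOrEmpty(uniqueKeyField))
            throw new ArgumentException("Unique key field name is required", nameof(uniqueKeyField));
        return new KeyValuePair<string, string>("literal." + uniqueKeyField, value);
    }
}
```

For a schema whose unique key is `DocId`, `ExtractParams.LiteralId("DocId", "id1234")` gives the key `literal.DocId` with the value `id1234`.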
Original comment by gmpig...@gmail.com
on 4 Apr 2012 at 10:15
Moved to https://github.com/mausch/SolrNet/issues/87
Original comment by mauricio...@gmail.com
on 8 Sep 2013 at 4:22
Original issue reported on code.google.com by
mauricio...@gmail.com
on 5 Oct 2009 at 8:39