chrismattmann / etllib

This is the ETL lib package. It provides an API to munge and prepare JSON, TSV and other data using Apache Tika and JSON parsing/loading for ETL via Apache OODT (or other libs) into Apache Solr.

Solr url for poster #30

Closed smritish closed 9 years ago

smritish commented 9 years ago

Running poster -u "http://localhost:8983/solr/#/" gives a 405 error. The other URLs I tried do not work either:

POST http://localhost:8983/solr/#/~cores/collection1/update HTTP error(HTTP Error 405: HTTP method POST is not supported by this URL)

POST http://localhost:8983/solr/update/ HTTP error(HTTP Error 400: Bad Request)

What is the correct url for Solr to be used with poster?
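A note on the first two URLs: everything after the `#` is a URL fragment, which browsers use for the Solr admin UI and which is not sent to the server as part of the request path. The sketch below (Python 3, standard library only; the helper name `server_visible_url` is made up for illustration) shows what the server is actually asked for, which would explain why those POSTs never reach an update handler.

```python
from urllib.parse import urldefrag


def server_visible_url(url):
    """Return the URL with its fragment removed, i.e. the part an HTTP request targets."""
    return urldefrag(url).url


# Both admin-UI style URLs from the report collapse to the Solr root.
for candidate in (
    "http://localhost:8983/solr/#/",
    "http://localhost:8983/solr/#/~cores/collection1/update",
):
    print(candidate, "->", server_visible_url(candidate))
```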

chrismattmann commented 9 years ago

Try: http://localhost:8983/solr/collection1/update

smritish commented 9 years ago

POST http://localhost:8983/solr/collection1/update HTTP error(HTTP Error 400: Bad Request)

chrismattmann commented 9 years ago

@smritish, please give me the full path to your Solr query interface. What is the full URL where you can actually issue queries? (In the Solr admin, if you click on the query URL that it generates, what does it show when you get results?)

smritish commented 9 years ago

I can query and see the entire schema.xml using the URL below: http://localhost:8983/solr/collection1/schema?wt=json

Other queries that work:
http://localhost:8983/solr/collection1/schema/dynamicfields?wt=json
http://localhost:8983/solr/collection1/schema/fields?wt=json

I also tried http://localhost:8983/solr/collection1/schema/update with poster, but this does not work either.

chrismattmann commented 9 years ago

See this page: https://wiki.apache.org/solr/UpdateXmlMessages

Does: http://localhost:8983/solr/update work?

smritish commented 9 years ago

Using http://localhost:8983/solr/update gives the output below:

C:\Python27\Lib\site-packages\etl>poster -v -u http://localhost:8983/solr/update C:\Users\sushar\Downloads\new2\000003dc-56b5-4b32-8a24-124a181f7ef9.json
^Z
Processing: C:\Users\sushar\Downloads\new2\000003dc-56b5-4b32-8a24-124a181f7ef9.json
{"add": {"doc": {"A": "2013-1-10", "C": "Buenos Aires", "B": "Capital", "E": "", "D": "Repositores", "G": "inmediato", "F": "$ 5000 mas viaticos", "I": "", "H": "a largo plazo", "K": "Alarcorp consultora", "J": "", "M": "-27.8090096", "L": "buenos aires, santiago del estero, argentina", "O": "2013-01-11", "N": "-64.2368469", "Q": "2013-03-05", "P": "http://www.computrabajo.com.ar/bt-ofrd-alarcorp69-21444.htm", "id": "000003dc-56b5-4b32-8a24-124a181f7ef9"}, "boost": 1.0}}
POST http://localhost:8983/solr/update HTTP error(HTTP Error 400: Bad Request)

SantoshShankars commented 9 years ago

Hi @smritish, I was also getting "HTTP Error 400: Bad Request". I did two things to resolve it:

  1. The Solr update handler can accept JSON arrays as input, whereas etllib only accepts a single JSON document at a time. So removing the array brackets ('[' and ']') from the JSON might help.
  2. The update URL I used was http://localhost:8983/solr/update/json (I am not sure whether this makes a major difference).

chrismattmann commented 9 years ago

Hey @SantoshShankars, great help. Yes, you have to use the JSON update URL; you are totally correct. Also yes, poster accepts a single JSON document at a time, not an array, so the file shouldn't start with an array bracket. That should fix it! @smritish, please confirm and I will wrap this one up.
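To make the two points above concrete, here is a minimal Python 3 sketch, not etllib's actual code. It splits a file that may contain a JSON array into single documents and wraps each one in the {"add": {"doc": ..., "boost": 1.0}} envelope seen in the verbose poster output earlier in this thread. The helper names (`split_documents`, `build_add_payload`, `post_json`) are hypothetical, and `post_json` is only one assumed way to send the payload to the /solr/update/json URL mentioned above.

```python
import json
import urllib.request


def split_documents(text):
    """Parse JSON text and always return a list of single documents."""
    data = json.loads(text)
    return data if isinstance(data, list) else [data]


def build_add_payload(doc, boost=1.0):
    """Wrap one document in the add envelope shown in the verbose output above."""
    return json.dumps({"add": {"doc": doc, "boost": boost}})


def post_json(url, payload):
    """POST one JSON payload; the URL is supplied by the caller (assumption)."""
    request = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# An array of two documents becomes two separate add payloads.
for doc in split_documents('[{"id": "a"}, {"id": "b"}]'):
    print(build_add_payload(doc))
```

With a running Solr instance, each payload could then be sent with something like post_json("http://localhost:8983/solr/update/json", payload); that call is left out of the loop above so the sketch runs without a server.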

smritish commented 9 years ago

Thanks all. I was getting the error because of undefined fields; I had to add the fields to the schema.
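For anyone hitting the same 400 error, a quick way to spot this before posting is to compare the document's keys with the field names defined in the schema. The sketch below is Python 3 and assumes you have already collected the field names and dynamic field patterns yourself, for example from the schema/fields and schema/dynamicfields URLs listed earlier; the function name `undefined_fields` and the sample schema are made up for illustration.

```python
from fnmatch import fnmatchcase


def undefined_fields(doc, field_names, dynamic_patterns=()):
    """Return the document keys that match no schema field and no dynamic pattern."""
    known = set(field_names)
    return sorted(
        name
        for name in doc
        if name not in known
        and not any(fnmatchcase(name, pattern) for pattern in dynamic_patterns)
    )


# Hypothetical schema: only "id" and "title" are defined, plus a "*_s" dynamic field.
doc = {"id": "000003dc-56b5-4b32-8a24-124a181f7ef9", "A": "2013-1-10", "C": "Buenos Aires"}
print(undefined_fields(doc, ["id", "title"], ["*_s"]))  # -> ['A', 'C']
```

Any name in the returned list would need to be added to the schema (or covered by a dynamic field) before the document can be indexed.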

chrismattmann commented 9 years ago

Thanks, marking this one as resolved.