Closed alexcg1 closed 4 years ago
Before cookiecutter was broken, I got output like https://github.com/jina-ai/examples/tree/master/my-first-jina-app#curl
It seems like the result is analogous to what we got before cookiecutter was broken, but the score has an opName and refId instead of the opName and the score.
Moreover, this outputs a more exhaustive set of fields.
Hmmm...
I thought it was just returning junk data, but it seems to be still indexing the default data (['abc', 'cde', 'efg']) rather than the file defined in the env variable DATA_PATH (which I've set to data/startrek_tng.csv).
Hang on, I'll try setting DATA_PATH to gibberish to see if it errors out.
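Whether a gibberish DATA_PATH errors out depends on whether the variable reaches the Python process at all. Here is a minimal sketch of the usual "read env variable, fall back to defaults" pattern. It is not the actual code in app.py: load_lines is a hypothetical helper, and the MAX_DOCS default of 500 is an assumption borrowed from the value used in this thread.

```python
import os
import warnings

# Default sample data mentioned in this thread.
DEFAULT_LINES = ["abc", "cde", "efg"]


def load_lines(environ=None):
    """Hypothetical helper: pick the lines to index based on DATA_PATH."""
    environ = os.environ if environ is None else environ
    data_path = environ.get("DATA_PATH")

    if data_path is None:
        # The variable never reached this process, so fall back, but say so.
        warnings.warn("DATA_PATH is not set; indexing built-in sample lines")
        return list(DEFAULT_LINES)

    if not os.path.isfile(data_path):
        # A set-but-wrong path should fail loudly instead of falling back.
        raise FileNotFoundError(f"DATA_PATH points to a missing file: {data_path}")

    # Assumed default of 500, matching the MAX_DOCS value used in this thread.
    max_docs = int(environ.get("MAX_DOCS", "500"))
    with open(data_path, encoding="utf-8") as handle:
        # zip() stops after max_docs lines without reading the rest of the file.
        return [line.rstrip("\n") for _, line in zip(range(max_docs), handle)]
```

With a pattern like this, an unset variable and a wrong path behave differently: the first quietly indexes the defaults (here with a warning), while the second raises.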
Right. Even if I set DATA_PATH to gibberish:
rm -rf workspace
Set DATA_PATH to dfneedbe
python app.py index (runs fine)
python app.py search (runs fine)
curl --request POST -d '{"top_k": 10, "mode": "search", "data": ["text:picard to riker"]}' -H 'Content-Type: application/json' 'http://0.0.0.0:65481/api/search'
it still returns similar results.
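For anyone replicating without curl, the same request can be built from Python with only the standard library. This is a rough sketch: build_search_request and send_search are hypothetical helper names, and the port 65481 is simply the one from the curl command above, so it will likely differ on another run.

```python
import json
import urllib.request


def build_search_request(query, top_k=10, port=65481):
    """Build (but do not send) the same POST request as the curl command."""
    payload = {"top_k": top_k, "mode": "search", "data": [f"text:{query}"]}
    return urllib.request.Request(
        url=f"http://0.0.0.0:{port}/api/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_search(request, timeout=10):
    """Send the request; needs `python app.py search` running on that port."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage, with the search flow running in another pane:
#   result = send_search(build_search_request("picard to riker"))
#   print(json.dumps(result, indent=2))
```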
Hang on. Additional update. When actually using curl, python app.py search throws errors like chunk_idx@295967[E]:'NoneType' object has no attribute 'shape'.
I have a feeling a few things are happening:
DATA_PATH: It's just returning values like abc, which are in the lines variable on line 30 of app.py.
numpy-related errors, like:
chunk_idx@295430[W]:you can not query from NumpyIndexer as its "query_handler" is not set. If you are indexing data from scratch then it is fine. If you are querying data then the index file must be empty or broken.
chunk_idx@295430[E]:'NoneType' object has no attribute 'shape'
Which makes me think it's trying to use the numpy index_type, not the strings index_type.
That's just off the top of my head though.
I'm getting all kinds of weird stuff here. You should replicate for yourself in case I'm doing something wrong.
Instead of editing app.py directly, create environment variables like MAX_DOCS, DATA_PATH as defined in app.py.
I'm hoping it's just me being tired. Or tech issues my end
I'm running python app.py ..... using MAX_DOCS=500, for example.
I'm running curl from a different Tmux pane. But since it's only connecting to the jina port and not dealing with env variables, I don't think that should be a problem.
How can I check that python app.py index is actually indexing my data source rather than default values? Like by digging into the workspace folder?
OK. Think I might've found the issue. I should've run export DATA_PATH='data/startrek_tng.csv'. Missing export caused it to go to default behavior. Setting DATA_PATH to gibberish crashed it when using export.
My bad!
It's now indexing properly AFAICS
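To make the export point concrete: a variable assigned in the shell without export stays local to that shell and is not passed to child processes such as python app.py. The sketch below reproduces this in isolation. It assumes a POSIX sh is available on the PATH, and data_path_seen_by_child is a hypothetical helper written for this illustration.

```python
import os
import shlex
import subprocess
import sys

# Child command: a fresh Python process that prints the DATA_PATH it sees.
CHILD = (
    f"{shlex.quote(sys.executable)} "
    "-c \"import os; print(os.environ.get('DATA_PATH'))\""
)


def data_path_seen_by_child(assignment):
    """Run `assignment` in a POSIX shell, then report the child's DATA_PATH."""
    # Start from an environment without DATA_PATH so the result is not
    # influenced by whatever the current session has exported.
    env = {k: v for k, v in os.environ.items() if k != "DATA_PATH"}
    result = subprocess.run(
        ["sh", "-c", f"{assignment}; {CHILD}"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


without_export = data_path_seen_by_child("DATA_PATH='data/startrek_tng.csv'")
with_export = data_path_seen_by_child("export DATA_PATH='data/startrek_tng.csv'")

print("without export:", without_export)  # None
print("with export:   ", with_export)     # data/startrek_tng.csv
```

Running echo $DATA_PATH in the same shell would show the value in both cases, which is why a missing export is easy to overlook.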
Hi @alexcg1, this is great. Could you also please paste the output log of the query now that the data path is exported and recognized and jina[http] is installed? Does it output the score as expected?
We could include the export in the ReadMe.md.
Also, this log is more exhaustive than the previous one because level_depth has been introduced due to the recursive structure of Document, so it's not broken.
This has been fixed and both indexing and querying work fine. The recursive structure of the document gives more exhaustive/verbose logs.
Problem
After running pip install jina[http], python app.py search seems to work without crashing. However, querying via curl has issues:
Input:
curl --request POST -d '{"top_k": 10, "mode": "search", "data": ["text:hey, dude"]}' -H 'Content-Type: application/json' 'http://0.0.0.0:65481/api/search'
Output:
Environment
Environment variables:
MAX_DOCS: 500
DATA_PATH: data/startrek_tng.csv