Documentation for Crossref's REST API. For questions or suggestions, see https://community.crossref.org/
Cursor disconnect RemoteSolrException Unable to parse 'cursorMark' after totem: value must either be '*' or the 'nextCursorMark' returned by a previous search #490
#427 already points to an issue with cursors. With my script:

```bash
#
# download from crossref RESTful API via cursor
#
downloadWithCursor() {
  local l_rows="$1"
  local l_index="$2"
  local l_cursor="$3"
  target=$sampledir/crossref-$l_index.json
  src="https://api.crossref.org/types/proceedings/works?select=event,title,DOI&rows=$l_rows&cursor=$l_cursor"
  download $src $target
}

#
# get Crossref data
# see also https://github.com/TIBHannover/confIDent-dataScraping
#
getCrossRef() {
  rows=1000
  index=1
  totalRows=0
  # force while entry
  total=$rows
  downloadWithCursor $rows $index "*"
  while [ $totalRows -lt $total ]
  do
    target=$sampledir/crossref-$index.json
    status=$(jq '.status' $target | tr -d '"')
    total=$(jq '.message["total-results"]' $target)
    # get and remove quotes from cursor
    cursor=$(jq '.message["next-cursor"]' $target | tr -d '"')
    startindex=$(jq '.message.query["start-index"]' $target)
    perpage=$(jq '.message["items-per-page"]' $target)
    index=$[$index+1]
    if [ "$status" == "ok" ]
    then
      totalRows=$[$totalRows+$rows]
    else
      # force while exit
      totalRows=1
      total=0
      # remove invalid
      mv $target $target.err
    fi
    echo "status: $status index: $index $totalRows of $total startindex: $startindex perpage=$perpage cursor:$cursor"
    if [ $totalRows -lt $total ]
    then
      # wait a bit
      sleep 2
      downloadWithCursor $rows $index "$cursor"
    fi
  done
  cat $sampledir/crossref-*.json | jq .message.items[].title | cut -f2 -d'[' | cut -f2 -d'"' | grep -v "]" | tr -s '\n' > $sampledir/proceedings-crossref.txt
}
```

I ran into a similar issue:

```
{
  "status": "error",
  "message-type": "exception",
  "message-version": "1.0.0",
  "message": {
    "name": "class org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException",
    "description": "org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http:\/\/mds3:8984\/solr\/crmds1: Unable to parse 'cursorMark' after totem: value must either be '*' or the 'nextCursorMark' returned by a previous search: AoJ4 NDNi\/ECPwJodHRwOi8vZHguZG9pLm9yZy8xMC4xNzc1OC9laXJhaTU=",
    "message": "Error from server at http:\/\/mds3:8984\/solr\/crmds1: Unable to parse 'cursorMark' after totem: value must either be '*' or the 'nextCursorMark' returned by a previous search: AoJ4 NDNi\/ECPwJodHRwOi8vZHguZG9pLm9yZy8xMC4xNzc1OC9laXJhaTU=",
```

Running

```
jq . *.err | grep "search:" | cut -f7 -d:
```

gives me:

```
value must either be '*' or the 'nextCursorMark' returned by a previous search
AoJ7o 7Hk/ECPwJodHRwOi8vZHguZG9pLm9yZy8xMC4xMTQ1LzI1MzA1NDQ=",
value must either be '*' or the 'nextCursorMark' returned by a previous search
AoJ3pL 1svECPwJodHRwOi8vZHguZG9pLm9yZy8xMC4xMTQ1LzExMzg5NTM=",
value must either be '*' or the 'nextCursorMark' returned by a previous search
AoJ teyWtfECPwJodHRwOi8vZHguZG9pLm9yZy8xMC4zMTE1LzEyMjU3MzM=",
value must either be '*' or the 'nextCursorMark' returned by a previous search
AoJx6NyU0 8CPwhodHRwOi8vZHguZG9pLm9yZy8xMC4xMDYxLzk3ODA3ODQ0ODEwMTE="
```

Every rejected token contains a space, so I suspect the space in the token is the issue.

Please update the documentation to state what kind of encoding you expect or, better, fix the upstream library to use tokens that need no encoding (no spaces). Improving the error message so that it points to the FAQ would also be helpful.

To close this issue, please let me know whether my space assumption is right and whether replacing the space with "+" will fix the problem.
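
One way to test the space assumption is sketched below. It rests on a guess that this report does not confirm: that the spaces in the rejected tokens were `+` characters in the original `next-cursor` value, decoded as spaces because the script puts the cursor into the query string unencoded. If that is the case, substituting `+` for the space by hand would probably not help either, since a literal `+` in a query string is conventionally decoded as a space again. Percent-encoding the cursor would be the thing to try. `urlencode` is a hypothetical helper, not part of the script above or of any Crossref tooling.

```shell
# Hypothetical helper (not in the original script): percent-encode a string
# so it can be used as a query parameter value. Assumes ASCII input, which
# holds for the cursor tokens shown in this report.
urlencode() (
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}        # everything after the first character
    c=${s%"$rest"}     # the first character itself
    s=$rest
    case $c in
      [a-zA-Z0-9.~_-]) out=$out$c ;;                    # unreserved: keep as is
      *) out=$out$(printf '%%%02X' "'$c") ;;            # everything else: %XX
    esac
  done
  printf '%s\n' "$out"
)

# Possible use inside getCrossRef (assumption: -r makes jq print the raw
# string, so the tr -d '"' step is no longer needed):
#   cursor=$(urlencode "$(jq -r '.message["next-cursor"]' "$target")")
```

With this change the initial `*` cursor would be sent as `%2A`, which a server that decodes query parameters should treat the same as `*`; if it does not, the first request can keep passing `*` unencoded.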