Open gingerbeardman opened 1 year ago
OK, I figured it out and support seems to be missing, so I will rename the issue.
ia search 'hanafuda' --parameters rows:10 --field addeddate --sort "addeddate desc"
But...
ia search 'hanafuda' --fts --parameters rows:10 --field addeddate --sort "addeddate desc"
I am using:
pip install internetarchive
The confusion here is that ia search uses various endpoints depending on several things. It uses the Scrape API by default, Advanced Search when either the rows or page parameter is specified, and our beta FTS API when either --fts or --dsl-fts is specified.
The reasoning behind this is that the Advanced Search API is not designed for scraping/retrieving full result sets (it's capable of doing so, but it's not designed for it). The Scrape API is designed for dumping full result sets. I assume that most people want full result sets when using ia search, and that's why the Scrape API is the default. When a user specifies that they only want a subset of the results (i.e. via the page or rows params), then Advanced Search is used.
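To make the selection rule above concrete, here is a minimal Python sketch of it. The function name pick_endpoint and the returned labels are hypothetical and are not part of the internetarchive library; the sketch also assumes that the full-text flags take precedence over rows and page, which matches the behaviour reported in this thread but is not confirmed as the library's actual logic.

```python
def pick_endpoint(params=None, fts=False, dsl_fts=False):
    """Hypothetical helper mirroring the rule described in this thread.

    Not part of the internetarchive library; the return values are
    illustrative labels, not real endpoint URLs.
    """
    params = params or {}
    # Assumption: --fts / --dsl-fts win over any rows/page parameters.
    if fts or dsl_fts:
        return "fts"
    # Asking for a subset of results (rows or page) selects Advanced Search.
    if "rows" in params or "page" in params:
        return "advancedsearch"
    # Otherwise the full result set is dumped via the Scrape API.
    return "scrape"


print(pick_endpoint())                        # no parameters
print(pick_endpoint({"rows": 10}))            # subset requested
print(pick_endpoint({"rows": 10}, fts=True))  # full-text flag set
```

Read this way, adding --fts to a command that already passes rows moves the query to the FTS API, where rows is no longer the relevant parameter.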
Then there's the FTS API. This is in beta, is not currently documented publicly, and is subject to change. The specific parameter you're after though is size as opposed to rows:
» ia search 'hanafuda' --fts --parameters size:10 | wc -l
10
--fields is not currently supported with --fts; all indexed fields are returned by default. addeddate is not returned, but publicdate is (under .fields.meta_publicdate). Sorting is not supported in the beta FTS API at this time.
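Since the beta FTS API does not sort, one possible workaround is to sort the returned records on the client side. The sketch below is only an illustration under several assumptions: that each line of output is one JSON object, that the date sits under fields.meta_publicdate as either a string or a list of strings, and that the dates are ISO 8601 timestamps (so they sort correctly as plain strings). The function newest_first, the name key, and the sample records are all made up for the example.

```python
import json


def newest_first(lines, limit=10):
    """Sort JSON-lines records by fields.meta_publicdate, newest first."""
    def publicdate(record):
        value = record.get("fields", {}).get("meta_publicdate", "")
        # Assumption: the value may be a single string or a list of strings.
        if isinstance(value, list):
            value = value[0] if value else ""
        return value

    records = [json.loads(line) for line in lines if line.strip()]
    # ISO 8601 timestamps compare correctly as strings; records with no
    # date get "" and therefore end up last.
    return sorted(records, key=publicdate, reverse=True)[:limit]


# Made-up sample records, not real API output.
sample = [
    '{"name": "item-a", "fields": {"meta_publicdate": "2019-03-01T00:00:00Z"}}',
    '{"name": "item-b", "fields": {"meta_publicdate": "2021-07-15T00:00:00Z"}}',
    '{"name": "item-c", "fields": {"meta_publicdate": ["2020-01-10T00:00:00Z"]}}',
]

for record in newest_first(sample, limit=2):
    print(record["name"])
```

Note that this only orders whatever the query returned; with size:10 it sorts those ten records and does not guarantee they are the ten most recent items overall.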
Sorry for the confusion. We hope to consolidate these endpoints in the future!
Thanks @jjjake very informative. I'll keep an eye on progress.
It seems very wasteful to query the whole set when I only want the X most recent (for example, any new items since the last time I ran the query). But maybe I'm overthinking it!? I prefer to keep things lean and save time and electricity on this earth.
The "beta FTS API" doesn't seem to point to the right endpoint. Results from ia search are not the same as those on the https://archive.org/search?query=... page; the JS on that page uses https://archive.org/services/search/beta/page_production/, which returns cleaner results.
Is there any plan to switch to that endpoint?
@chgans be-api.us.archive.org/ia-pub-fts-api is the current recommendation from the developers of our FTS beta API. We do hope to consolidate our search endpoints in the future though. Thanks for checking!
Hi,
I am doing
ia search --parameters="..."
...but I do not know what parameters it accepts.
Is there a list or documentation anywhere?
My goal is to return a small number of results sorted by most recently "added" first.
sort=-publicdate
sort createdate desc
sort_by=-addeddate
But those do not seem to work with ia search, or maybe I am doing it wrong? I have also tried
ia search --parameters="rows=10" --sort="addeddate desc" "hanafuda"
ia search --parameters="rows:10" --sort="created_on desc" "hanafuda"
Any help appreciated.
Thanks!