Open igobranco opened 4 years ago
The `fl` parameter isn't filtering the fields of any CDXJ API entries.
With version 2.4.0-rc5, I can't filter the output of the CDXJ API.
Currently, a request to https://m.preprod.arquivo.pt/wayback/cdx?output=json&url=fccn.pt&fl=timestamp&limit=3
returns:
{"urlkey": "pt,fccn)/", "timestamp": "19961013145650", "status": "200", "url": "http://www.fccn.pt/", "filename": "AWP-Roteiro-20090510220155-00000.arc.gz", "length": "0", "mime": "text/html", "offset": "45198", "digest": "OWMAVER7CCNJWL2E5ZURDDKGCHWS7JJO", "source": "$root:gigantic_index_1_v2.cdxj", "source-coll": "$root"}
{"urlkey": "pt,fccn)/", "timestamp": "19971210202137", "status": "200", "url": "http://www.fccn.pt/", "filename": "PT-HISTORICAL-1997-GROUP-ABP-20100830000000-00000.arc.gz", "length": "0", "mime": "text/html", "offset": "11878742", "digest": "ZDBF3G73EW3UK6GIWTLDCIDKCAPCBFJ2", "source": "$root:gigantic_index_1_v2.cdxj", "source-coll": "$root"}
{"urlkey": "pt,fccn)/", "timestamp": "19971210202137", "url": "http://www.fccn.pt:80/", "mime": "text/html", "status": "200", "digest": "ZDBF3G73EW3UK6GIWTLDCIDKCAPCBFJ2", "length": "1084", "offset": "11878742", "filename": "PT-HISTORICAL-1997-GROUP-ABP-20100830000000-00000.arc.gz", "source": "$root:IA.cdxj", "source-coll": "$root"}
The expected result is:
{"timestamp": "19961013145650"}
{"timestamp": "19971210202137"}
{"timestamp": "19971210202137"}
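For anyone who needs the reduced output before the server-side filter behaves as expected, the projection can be done on the client. The following is only a sketch: `project_fields` is a hypothetical helper, not part of pywb, and it assumes the response body holds one JSON object per line, as in the output above (the sample records are shortened).

```python
import json


def project_fields(response_text, fields):
    """Keep only the requested fields from each JSON line of a CDX response.

    Hypothetical client-side helper, not a pywb function.
    """
    projected = []
    for line in response_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        # Preserve the order in which the fields were requested.
        projected.append({name: record[name] for name in fields if name in record})
    return projected


# Shortened sample records taken from the response shown above.
sample = "\n".join([
    '{"urlkey": "pt,fccn)/", "timestamp": "19961013145650", "status": "200"}',
    '{"urlkey": "pt,fccn)/", "timestamp": "19971210202137", "status": "200"}',
])

for entry in project_fields(sample, ["timestamp"]):
    print(json.dumps(entry))
```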
After reviewing the code, I found that field filtering works with the `fields` parameter instead of `fl`.
https://github.com/webrecorder/pywb/blob/92e459bda52a2b03f33a4b0b8094ed424248d2a5/pywb/warcserver/index/query.py#L87
https://github.com/webrecorder/pywb/blob/92e459bda52a2b03f33a4b0b8094ed424248d2a5/pywb/warcserver/index/cdxops.py#L46
Nevertheless, the documentation needs a review: cdxserver_api
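As a possible workaround until the behavior or the documentation is settled, the request can be rewritten to use `fields`. This is a minimal sketch based only on the observation above that the linked code reads `fields`; `use_fields_param` is a hypothetical helper name, and whether the server then returns only the requested fields has not been verified here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def use_fields_param(cdx_url):
    """Rename an `fl` query parameter to `fields`, leaving the rest untouched.

    Hypothetical helper; assumes the server honors `fields`, as the linked
    pywb code suggests.
    """
    parts = urlsplit(cdx_url)
    query = [
        ("fields" if key == "fl" else key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


original = "https://m.preprod.arquivo.pt/wayback/cdx?output=json&url=fccn.pt&fl=timestamp&limit=3"
print(use_fields_param(original))
```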