Open uy5cu71 opened 2 years ago
The Wayback CDX API has a matchType option, for example: https://web.archive.org/cdx/search/cdx?url=https://twitter.com/jack/statuses&matchType=prefix
Which returns:
com,twitter)/jack/statuses/"/antarnisti/status/245078986827386880" 20121223123338 https://twitter.com/jack/statuses/%22/Antarnisti/status/245078986827386880%22 text/html 404 VNL4UHLBLX2UYNDIOZZ7ZR3CFYURIVND 5296
com,twitter)/jack/statuses/"/antarnisti/status/245078986827386880" 20130203195805 https://twitter.com/jack/statuses/%22/Antarnisti/status/245078986827386880%22 warc/revisit - VNL4UHLBLX2UYNDIOZZ7ZR3CFYURIVND 1042
com,twitter)/jack/statuses/"/antarnisti/status/245078986827386880" 20130312144230 https://twitter.com/jack/statuses/%22/Antarnisti/status/245078986827386880%22 warc/revisit - VNL4UHLBLX2UYNDIOZZ7ZR3CFYURIVND 1035
com,twitter)/jack/statuses/"/antarnisti/status/245078986827386880" 20130326132131 https://twitter.com/jack/statuses/%22/Antarnisti/status/245078986827386880%22 text/html 404 BMAXRTF3OVX3HL22WUMYLBYT2UJV3HT3 9317
com,twitter)/jack/statuses/"/antarnisti/status/245078986827386880" 20130402123359 https://twitter.com/jack/statuses/%22/Antarnisti/status/245078986827386880%22 warc/revisit - BMAXRTF3OVX3HL22WUMYLBYT2UJV3HT3 1030
Is it possible to download all of these URLs? waybackpack trims the URL based on the CLI input.
I tried adding the matchType parameter in the cdx file and got a valid response, but waybackpack still trims the URL based on the CLI input.
Hi @uy5cu71, and thanks for your interest in this library. Unfortunately, I'm not sure I 100% understand your inquiry. But if it helps: waybackpack does not currently support the matchType parameter.
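Since waybackpack does not support matchType, one possible workaround is to query the CDX endpoint directly and build the snapshot URLs from the response. The following is only a minimal sketch, not part of waybackpack: the function names are hypothetical, it assumes the default space-separated CDX output shown above (urlkey, timestamp, original URL, mimetype, status code, digest, length), and it assumes snapshots are reachable at `https://web.archive.org/web/<timestamp>/<original URL>`.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(url, match_type="prefix"):
    """Build a CDX query URL using the matchType parameter from the issue."""
    return CDX_ENDPOINT + "?" + urlencode({"url": url, "matchType": match_type})


def parse_cdx_lines(text):
    """Yield (timestamp, original_url) pairs from default CDX text output."""
    for line in text.splitlines():
        fields = line.split(" ")
        if len(fields) < 7:
            continue  # skip blank or malformed lines
        yield fields[1], fields[2]


def snapshot_url(timestamp, original):
    """Assumed replay URL layout for a single capture."""
    return f"https://web.archive.org/web/{timestamp}/{original}"


def list_snapshots(url, match_type="prefix"):
    """Fetch the CDX listing (network call) and return snapshot URLs."""
    with urlopen(build_cdx_query(url, match_type)) as response:
        text = response.read().decode("utf-8")
    return [snapshot_url(ts, original) for ts, original in parse_cdx_lines(text)]

# Example use (performs a network request):
#   for link in list_snapshots("https://twitter.com/jack/statuses"):
#       print(link)
```

Each returned URL keeps the full original path reported by the CDX API rather than the prefix given on the command line, so the captures can then be fetched one by one with any HTTP client. Note that `warc/revisit` rows point at captures whose content is identical to an earlier one, so they could be filtered out by digest if duplicates are not wanted.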