wabarc / wayback

An archiving tool with an IM-style interface that prioritizes privacy and accessibility, integrated with various archival services including Internet Archive, archive.today, Ghostarchive, IPFS, Telegraph, and file systems.
https://docs.wabarc.eu.org
GNU General Public License v3.0

Bridging ArchiveBox #388

Open waybackarchiver opened 1 year ago

waybackarchiver commented 1 year ago

ArchiveBox is a powerful open-source web archiving tool that supports a wide range of archive formats, includes an administrative backend, and offers various other features. It performs well in archiving tasks, and Wayback aims to build upon it to provide even richer functionality.

Our plan is to use ArchiveBox as an additional backend and to request archiving from it through a REST API. Unfortunately, ArchiveBox's planned API has not been implemented yet. In the early stage, we will therefore implement this feature by simulating web requests. Once ArchiveBox's API is available, we will migrate to it in a later stage.

pirate commented 1 year ago

Excited to see this! It adds some logs to the fire under my butt to get that REST API pushed out 😆

In the meantime, you can go one step further than just mocking out API requests as no-ops. If you're willing to have your REST API stubs call Python API functions imported from archivebox/main.py, then it will work just like it would with the API, as long as the archivebox instance is on the same local machine. Then we can swap those local function calls for REST requests without too much difficulty. My plan for the API is to basically expose endpoints that call each main.py function, plus FastAPI/DRF GET/POST/PATCH/etc. on the models, so it will be almost exactly the same API (each kwarg becomes a POST parameter, stdout is returned as structured JSON instead, etc.).
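A minimal sketch of the swap described above, in Python, under stated assumptions. Nothing here is ArchiveBox's actual API: the `ArchiveBackend`, `LocalBackend`, and `RestBackend` names, the `/add` path, and the `url` parameter name are all hypothetical. The local backend takes whatever callable you pass in (for example, a function imported from archivebox/main.py on the same machine), and the REST backend sends the same kwargs as POST parameters, so callers would not need to change when one is replaced by the other.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Callable, Dict, Protocol


class ArchiveBackend(Protocol):
    """Hypothetical interface shared by both backends."""

    def add(self, url: str, **kwargs: Any) -> Dict[str, Any]: ...


class LocalBackend:
    """Calls a Python function in-process; only works on the same machine."""

    def __init__(self, add_func: Callable[..., Any]) -> None:
        # add_func is injected by the caller, e.g. a function imported from
        # archivebox/main.py. Its exact signature is an assumption here.
        self._add_func = add_func

    def add(self, url: str, **kwargs: Any) -> Dict[str, Any]:
        result = self._add_func(url, **kwargs)
        # Wrap the return value so both backends hand back a dict.
        return {"ok": True, "result": result}


class RestBackend:
    """Sends the same call as an HTTP POST; endpoint path is hypothetical."""

    def __init__(self, base_url: str) -> None:
        self._base_url = base_url.rstrip("/")

    def build_request(self, url: str, **kwargs: Any) -> urllib.request.Request:
        # Each kwarg becomes a POST parameter, mirroring the function call.
        params = {"url": url, **kwargs}
        data = urllib.parse.urlencode(params).encode("utf-8")
        return urllib.request.Request(
            self._base_url + "/add", data=data, method="POST"
        )

    def add(self, url: str, **kwargs: Any) -> Dict[str, Any]:
        # Assumes the server answers with a JSON object.
        with urllib.request.urlopen(self.build_request(url, **kwargs)) as resp:
            return json.loads(resp.read().decode("utf-8"))
```

Code written against `ArchiveBackend` can start with `LocalBackend(some_function)` and later switch to `RestBackend("http://localhost:8000")` without touching the call sites; the host and port shown are placeholders.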

waybackarchiver commented 1 year ago

Thanks for the hints; they opened my mind and gave me new ideas!

hugo-akaora commented 2 months ago

Hello, this seems to be progressing: https://github.com/ArchiveBox/ArchiveBox/issues/496#issuecomment-2080174235