Closed p6002 closed 10 months ago
Can you share the output of `archivebox --version` and your `./data/logs/*.log` log files around the time when you're clicking archive from the extension?
Version: https://pastebin.com/MtNx0p56 Error.log (only file in log directory) https://pastebin.com/y17RWHp6
Can you try with the dev version instead of master? Set it to `archivebox/archivebox:dev` and then do `docker-compose pull`.
No changes: https://pastebin.com/9EvFhspY
Can you try setting `archivebox config --set SAVE_MERCURY=False` and try again? It seems there might be an issue with the Mercury article text extractor.
Still nothing, here is output:
`root@f7704c11dd33:/data# su archivebox
$ archivebox config --set SAVE_MERCURY=False
find: '/.config/chromium/Crash Reports/pending/': No such file or directory
[i] [2023-02-21 19:23:08] ArchiveBox v0.6.3: archivebox config --set SAVE_MERCURY=False
/data
find: '/.config/chromium/Crash Reports/pending/': No such file or directory
find: '/.config/chromium/Crash Reports/pending/': No such file or directory
find: '/.config/chromium/Crash Reports/pending/': No such file or directory
SAVE_MERCURY=False
[i] Note: This change also affected these other options that depended on it: USE_MERCURY=False
$ `
Can you screenshot the extension config options from the extension popup in your browser? Specifically, I want to see how you configured the server URL in the end (I know you tried all the options specified in your original post). I don't have any good solutions or ideas yet, so I'm grasping at straws a bit, but maybe there's something weird I can see in a screenshot.
For reference it should be `http://archive.mydomain.com`, `http://archive.mydomain.com:port`, or `https://archive.mydomain.com` like so:
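The three URL shapes listed above can be checked mechanically before pasting a value into the extension popup. The following is only a minimal sketch using Python's standard `urllib.parse`; the function name `check_server_url` and its messages are made up for illustration and are not part of ArchiveBox or the extension.

```python
from urllib.parse import urlsplit


def check_server_url(raw):
    """Return a list of problems found in a server URL (empty list if none).

    Hypothetical helper: it only checks for an http/https scheme, a host
    name, and a numeric port, matching the three forms quoted in the thread.
    """
    problems = []
    parts = urlsplit(raw.strip())
    if parts.scheme not in ("http", "https"):
        problems.append("URL must start with http:// or https://")
    if not parts.hostname:
        problems.append("URL has no host name")
    try:
        parts.port  # raises ValueError if the port is not a valid number
    except ValueError:
        problems.append("port is not a valid number")
    return problems


if __name__ == "__main__":
    for candidate in ("https://archive.mydomain.com",
                      "http://archive.mydomain.com:8505",
                      "archive.mydomain.com"):
        print(candidate, "->", check_server_url(candidate) or "looks fine")
```

This only validates the shape of the value. It says nothing about whether the server is reachable at that address.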
Thanks
Here is screenshot: https://i.ibb.co/2NLWJpD/2023-02-25-002708.png
Of course, in place of "domain" I have my own domain. I can still manually add the page in the app. This is what the compose file looks like from where I run the container: https://pastebin.com/QdCqDwFw
Can you try with 8000 just to test temporarily instead of 8505, maybe it's a port mapping issue? I've seen issues with non-default ports in the past.
Or set all the ports to 8505 like so:
    command: server --quick-init 0.0.0.0:8505
    ports:
        - 8505:8505
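The idea behind the snippet above is that the port the server binds to inside the container should be the same as the container side of the `ports` mapping. A rough way to sanity-check that pairing is sketched below. The helper names are hypothetical, and the default of 8000 is an assumption based on the suggestion to test with 8000.

```python
import re


def bind_port(command):
    """Extract the port from a 'server ... HOST:PORT' command, or None."""
    match = re.search(r":(\d+)\s*$", command)
    return int(match.group(1)) if match else None


def container_ports(ports):
    """Return the container-side port of each 'HOST:CONTAINER' mapping."""
    # Drop an optional '/tcp' style suffix, then take the last ':' field.
    return [int(str(p).split("/")[0].split(":")[-1]) for p in ports]


def mapping_matches(command, ports, default_port=8000):
    """True if the port the server listens on is exposed by some mapping.

    default_port is an assumption used when the command names no port.
    """
    port = bind_port(command) or default_port
    return port in container_ports(ports)


if __name__ == "__main__":
    print(mapping_matches("server --quick-init 0.0.0.0:8505", ["8505:8505"]))
    print(mapping_matches("server --quick-init 0.0.0.0:8000", ["8505:8505"]))
```

A mismatch here would mean the published port leads to nothing inside the container, which is one way a non-default port can fail.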
I changed to 1209 and still the same thing. I have the firewall on the Synology turned off. Overall I have about 40 containers on different configurations and everything works. Archivebox also works in the browser, only the plugin does not connect.
I'm using the Edge browser on Windows, but I also tested in Firefox on Ubuntu and there is the same problem.
When I just add https://mydomain.com/ the task number on the addon icon just blinks.
If I add https://mydomain.com:1209/ then the number 1 on the icon lights up for a few seconds, but it doesn't change anything.
Yesterday I was still using Nginx proxy manager, but I changed to Synology's built-in reverse proxy and it didn't help at all.
Maybe some other browser plugin or setting is blocking this connection?
-
I have now observed what happens in the log when trying to use the addon.
When I use Firefox in a browser where I am not logged into archivebox, after clicking "Archive current page" it shows:
"POST /add/ HTTP/1.1" 302 0
"GET /accounts/login/?next=/add/ HTTP/1.1" 302 0
"GET /admin/login/ HTTP/1.1" 200 11143
When in Edge, where I am logged in, it shows:
"POST /add/ HTTP/1.1" 200 7049
Nevertheless, nothing is added to the page.
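The two log patterns above differ in one way: in the first, the `POST /add/` is answered with a 302 and followed by a request to a login page, while in the second it gets a 200. A small script can tell these apart in a longer log. This is only an illustrative sketch: the function names and the result labels are invented here, and a 200 only shows that the request reached the add view, not that a snapshot was created.

```python
import re

# Matches the request part of lines like: "POST /add/ HTTP/1.1" 302 0
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3}) \d+'
)


def parse(line):
    """Return (method, path, status) for a log line, or None if no match."""
    m = LOG_LINE.search(line)
    if not m:
        return None
    return m.group("method"), m.group("path"), int(m.group("status"))


def diagnose(lines):
    """Classify what happened to the first POST /add/ request in the log."""
    entries = [e for e in map(parse, lines) if e]
    for i, (method, path, status) in enumerate(entries):
        if method == "POST" and path.startswith("/add/"):
            if status == 200:
                return "reached-add-view"
            if status == 302:
                nxt = entries[i + 1] if i + 1 < len(entries) else None
                if nxt and "/login/" in nxt[1]:
                    return "redirected-to-login"
                return "redirected"
            return "other-status"
    return "no-add-request"


if __name__ == "__main__":
    firefox = [
        '"POST /add/ HTTP/1.1" 302 0',
        '"GET /accounts/login/?next=/add/ HTTP/1.1" 302 0',
        '"GET /admin/login/ HTTP/1.1" 200 11143',
    ]
    print(diagnose(firefox))
    print(diagnose(['"POST /add/ HTTP/1.1" 200 7049']))
```

Because the pattern is searched for anywhere in the line, it also works on lines carrying a container prefix such as `archivebox_1 |`.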
That's a great sign, it's getting the `/add/` submission at least. Are you sure you're adding unique URLs that aren't already archived? ArchiveBox only archives URLs once, it doesn't re-snapshot if you already have the URL.
Do you have some working docker-compose file for this project?
The default one in the repo works.
It works well, but it does not work with the browser addon.
I tested on firefox and the extension works fine. The problem exists only on Edge.
What do you need to help you fix it?
It doesn't work here.
Server:
try 1. docker bridge + port 8000 -> 8040
try 2. docker host + port 8000
archivebox/archivebox:dev
errorlog: https://pastebin.com/raw/XutpN9Kf
Firefox: add base url and "add current domain to list", but nothing happens
I can connect normally with Firefox in both tries
This project hasn't been updated for 2 years, which is probably why it is no longer supported by browsers.
Oh, thank you for the info.
@p6002 The extension hasn't had a major release because I have an ArchiveBox refactor that's been slow-moving and touches a lot of pieces and will add a new REST API. The extension developer is likely waiting for that new API to land before continuing work on this extension.
In the meantime ArchiveBox development and bugfixes have been ongoing in the `dev` branch. AFAIK the `dev` branch works with this extension in browsers for some people, I am using it right now without issues, but because I don't know much about the extension, I'm not exactly sure what might be breaking for the other people reporting issues in this thread.
This does not seem to work for me on Brave either. Can anyone please confirm whether this is supposed to work with the latest dev branch of ArchiveBox? Do I have to set up something special in docker-compose.yml to make it work?
I can confirm it's been working for a while and is currently working for me.
Can you post your docker-compose.yml and the full output of `docker compose run archivebox version`?
@pirate Thanks for the reply.
docker compose run archivebox version
chown: cannot access '/browsers/*': No such file or directory
0.7.1+editable
ArchiveBox v0.7.1+editable Cpython Linux Linux-5.14.0-4-amd64-x86_64-with-glibc2.36 x86_64
DEBUG=False IN_DOCKER=True IN_QEMU=False IS_TTY=True TZ=UTC FS_ATOMIC=True FS_REMOTE=True FS_USER=911:911 FS_PERMS=644 SEARCH_BACKEND=ripgrep
When I add a domain and then add the page via the Brave addon, this is what I get in the running docker terminal:
archivebox_1 | "POST /add/ HTTP/1.1" 302 0
archivebox_1 | "GET /accounts/login/?next=/add/ HTTP/1.1" 302 0
archivebox_1 | "GET /admin/login/ HTTP/1.1" 200 11143
archivebox_1 | "POST /add/ HTTP/1.1" 302 0
archivebox_1 | "GET /accounts/login/?next=/add/ HTTP/1.1" 302 0
archivebox_1 | "GET /admin/login/ HTTP/1.1" 200 11143
Did you set `docker compose run archivebox config --set PUBLIC_ADD_VIEW=True`?
It's required to allow the extension to submit URLs without authenticating.
I did not do that, the extension page did not mention it. I just ran that command and restarted the container. I still have the same issue.
I can see it posts something, but nothing happens on the server. I can add URLs manually on the server's own page, but having to keep it open and accessible all the time is a lot of friction.
archivebox_1 | "POST /add/ HTTP/1.1" 302 0
archivebox_1 | "GET /accounts/login/?next=/add/ HTTP/1.1" 302 0
archivebox_1 | "GET /admin/login/ HTTP/1.1" 200 11143
Is the extension using a REST API or doing some special server talk? I will try to look into it, although I am not a web dev. I can maybe make it work.
Can you post the full verbatim output of:
docker compose pull
docker compose run archivebox version
docker compose run archivebox config --set PUBLIC_ADD_VIEW=True
docker compose run archivebox config
docker compose down
docker compose down # yes, really, run it twice
docker compose up
docker compose logs
(please don't redact anything, just copy paste the exact commands you typed in and the full output as they appear in your terminal)
Edit: fixed typo `PUBLIC_ADD_PAGE` -> `PUBLIC_ADD_VIEW`
Isn't the parameter `PUBLIC_ADD_VIEW=True`?
BTW: I might have found something. As long as there are no entries in the domain area, or (this I'm not sure about yet) no regex entries or wrong regex entries, the extension reports and ArchiveBox saves entries.
Ah right sorry, I misremembered. (edited to fix it above)
Here it is in the docs about the extension for future reference: https://github.com/ArchiveBox/ArchiveBox/wiki/Usage#browser-extension-usage.
If anyone still needs help, please open a separate issue! 😁
I tried adding to the Edge browser extension:
With and without http/https - no success. I can't manually or automatically add pages to download.
I activated the CLI option to use it without login, as it says in the documentation.
When I add the page manually through the site, it downloads correctly. The problem is only with the browser addon.