bibanon / BASC-Archiver

Python-based Imageboard (4chan) complete thread archiver.
https://pypi.python.org/pypi/BASC-Archiver/

Request: Thread title or some words from the OP post in the created folder #21

Open HASJ opened 8 years ago

HASJ commented 8 years ago

It is a nightmare going through the dozens of folders of archived threads with no way to differentiate between them.

A --title option would be enormously welcome. Also, if you could make the archiver write to "/site/board/thread*/", then a folder such as "/4chan/a/745894 - dumb waaabus" would still work, even if renamed manually.

antonizoon commented 8 years ago

The folder structure is set up the way it is for a reason: so it can be parsed and hosted in an automated and standardized manner, following the 4chan.arc standard.

I understand your point though, so here's something better: I will set up a thread list generator (something like 4chan's Catalog) that automatically produces static HTML thread lists sorted by board, and you can navigate to them by clicking links.

It will gather a table of all thread titles, a link to OP image, and a short snippet of the OP post. It will use sortable table columns, and of course you can then Ctrl-F.
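
A rough sketch of what such a generator might emit, assuming each thread has already been reduced to an ID, a title, and an OP snippet. The `render_thread_list` name and the dictionary keys are hypothetical, not anything BASC-Archiver defines, and column sorting (which would need a script on the page) is left out:

```python
import html


def render_thread_list(board, threads):
    """Return a static HTML table listing the threads of one board.

    `threads` is an iterable of dicts with the hypothetical keys
    'id', 'title', and 'snippet'; this is not BASC-Archiver's own format.
    """
    rows = []
    for thread in sorted(threads, key=lambda t: t["id"]):
        rows.append(
            "<tr><td><a href=\"{id}/\">{id}</a></td>"
            "<td>{title}</td><td>{snippet}</td></tr>".format(
                id=int(thread["id"]),
                # Escape user-supplied text so it cannot break the markup.
                title=html.escape(thread["title"]),
                snippet=html.escape(thread["snippet"]),
            )
        )
    return (
        "<h1>/{board}/</h1>\n<table>\n"
        "<tr><th>Thread</th><th>Title</th><th>OP snippet</th></tr>\n"
        "{rows}\n</table>\n"
    ).format(board=html.escape(board), rows="\n".join(rows))
```

The links here assume thread folders are named by bare thread ID and sit next to the generated page.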

DanielOaks commented 8 years ago

We can have something like a --use-slug argument that uses the thread's slug (that dumb waaabus text) in the created folder name. Would make browsing through folders later a bit nicer. Archive-archives would want to not use the --use-slug argument if they're hosting that folder directly, but archive-archives should also be grabbing WARC files and uploading them to Archive.org for integration into Wayback / hosting their own WARC playback software anyway (once that feature's completed), so it's not an issue.

That thread list generator can work with both sorts of folder titles as well, since we'll just point it towards the archive folder and it'll spider it or something like that.
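
To illustrate the folder naming, here is a minimal sketch of how a slug could be appended to the thread ID. The `thread_folder_name` function, its parameters, and the character whitelist are assumptions made for this example, not the archiver's actual code:

```python
import re


def thread_folder_name(thread_id, slug=None, use_slug=False):
    """Build a thread folder name, optionally appending a cleaned-up slug.

    Hypothetical helper: with use_slug=False the name is the bare thread ID.
    """
    name = str(int(thread_id))
    if use_slug and slug:
        # Replace runs of anything outside a conservative whitelist with '-'.
        cleaned = re.sub(r"[^A-Za-z0-9._-]+", "-", slug).strip("-.")
        if cleaned:
            name = "{} - {}".format(name, cleaned)
    return name
```

Keeping the thread ID first means the folder still sorts and parses by ID whether or not a slug follows it.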

HASJ commented 8 years ago

Such quick answers, I can see the development is very much on fire!

Wish I wasn't so lazy and started to learn how to code :smile: Since the design is pretty much covered I can only wish good luck!

antonizoon commented 8 years ago

Alright, in the 4chan.arc standard, we should specify that slugs are allowed in folder names and any parsers should ignore any text after the thread ID.
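
A parser following that rule could look something like the sketch below. The `thread_id_from_folder` name and the exact pattern are assumptions for illustration; the 4chan.arc standard text is not quoted here:

```python
import re

# Leading digits are the thread ID; anything after them is ignored.
_THREAD_DIR = re.compile(r"^(\d+)(?:\D.*)?$")


def thread_id_from_folder(folder_name):
    """Return the thread ID at the start of a folder name, or None."""
    match = _THREAD_DIR.match(folder_name)
    return int(match.group(1)) if match else None
```

With this, "745894" and "745894 - dumb waaabus" both resolve to the same thread, while a folder that does not start with digits is skipped.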