Seedmanc / Booru-mass-uploader

This userscript allows you to mass-upload images to imageboard sites running *booru engines.
MIT License

add tags with sidecar tag text file? #17

Closed CuddleBear92 closed 6 years ago

CuddleBear92 commented 7 years ago

Could we add tags on upload from sidecar tag text files? One tag per line, same name as the file itself: filename.jpg would have a filename.jpg.txt next to it, for example.

Something like this would let people upload and add tags more quickly and correctly on a much larger scale.

People could export from their Hydrus collection with tags, for example, or get tags written like that in many other ways as well.

I feel this is missing; adding tags manually, or even just by clicking, is slow.

Seedmanc commented 7 years ago

I'm not familiar with that Hydrus thing you mentioned. Are you suggesting that people would have to create a separate .txt file for every image file they want to upload and select them all together with the images? Wouldn't it be better to have all tags for all images in a single text file instead, selected separately?

Besides, there is an option to parse the filenames themselves for tags for every image, so no need for workarounds with txt.

CuddleBear92 commented 7 years ago

Actually, no. Having sidecar tag text files on a per-file basis allows the user to tag every image with its own set of tags.

Many downloaders and parsers tied to boorus and other sites do this and produce these files.

As for Hydrus, it's just a local booru client with tagging. It also allows exporting files with these sidecars (as well as importing them).

People don't manually make these text files. They never do.

But having them makes it easier to upload and share the tags along with the file itself.

Also, filename parsing is limited: the user needs to know how to use regexes. Not to mention the filename length limit of the OS itself if you want to store tags that way.

ProximaNova commented 7 years ago

Yes, "the filename limit on the OS" is the biggest problem. For example, NTFS and FAT (used by Windows operating systems), has a max filename length of 260 characters. At my GitHub I wrote a program for downloading &id= web pages (with a .bat file) then parsing (with Vim) to another .bat file (which uses wget) that would download images with filenames in the format "[rating] [tags] [unique number]" to the C:/0/ folder, where tags are sadly truncated to the 260 filename limit (tags are lost from the filenames but not from the previously downloaded files). I later uploaded said files (which lacked the full set of tags) to another booru. Various third-party software can be used to have very long filenames (also, in 2016 Windows 10 lengthened the limit). Most filesystems (including ones used by Linux) have a max filename length of 255 basic ASCII characters. The only filesystem I saw that has a decent max filename length is Reiser4; Reiser4's maximum filename length is 3,976 bytes, which equals 3,976 basic ASCII text characters. Some posts have 233 tags, which in one case equated to 2731 characters; therefore, ideally, the max filename length should be 4096 characters (2^12) or more. 260 characters is only, in this post, from "Yes," to "then pars". So yes, sidecar file(s) should ideally be used to upload files with a fuller set of tags.

Links related to this post: https://en.wikipedia.org/wiki/Sidecar_file https://en.wikipedia.org/wiki/Comparison_of_file_systems#Limits https://www.gamefaqs.com/boards/922345-section-z/64202650 - "So I wrote a 4096 character long sentence... - Section-Z Message Board for Famicom Disk System - GameFAQs" https://chan.sankakucomplex.com/?tags=generaltagcount%3A201

ProximaNova commented 7 years ago

Update! - I was looking through https://chan.sankakucomplex.com/wiki/show?title=help%3A_advanced_search_guide and came across this: "[order:tagcount or] order:tagcount_asc = Order search results in ascending order based on number of tags." If you search this https://chan.sankakucomplex.com/?tags=order%3Atagcount the post with the most tags is this one: https://chan.sankakucomplex.com/post/show/6018034 = 755 tags and 8982 characters. I just wanted to post this to correct what I previously said: the max filename length should be at least 10,000 or 20,000 characters. See http://textmechanic.com/text-tools/basic-text-tools/count-characters-words-lines/
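To make the arithmetic above concrete, here is a rough sketch of a check for whether a tag list fits into a single filename. The helper name `tagsFitInFilename` is made up for illustration, and the 255-byte default is an assumption taken from the limit quoted above for most filesystems; it is not part of the uploader.

```javascript
// Hypothetical helper: does a space-separated tag list fit in one filename?
// limitBytes defaults to 255, an assumed limit (the figure quoted above for most filesystems).
function tagsFitInFilename(tags, extension, limitBytes = 255) {
  const name = tags.join(" ") + extension;
  // Count UTF-8 bytes rather than characters, since non-ASCII tags take more than one byte each.
  const bytes = new TextEncoder().encode(name).length;
  return { bytes, fits: bytes <= limitBytes };
}
```

A post with 8982 characters of tags would fail this check many times over, which is the case for sidecar files.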

CuddleBear92 commented 6 years ago

Sad to see this is still open. I guess I can expand on the brash and quick posts I made a year ago.

So yeah, DanbooruDownloader (https://github.com/Nandaka/DanbooruDownloader), Grabber (https://github.com/Bionus/imgbrd-grabber) and Hydrus (https://github.com/hydrusnetwork/hydrus) all support saving sidecar text files when saving or exporting their files. All three of course grab tags from whichever boorus they download from. Hydrus also allows manual tagging, plus shared tagging with the rest of the Hydrus users through a shared mappings server (with 281+ million mappings of tags and files).

Having the ability to import these into the mass uploader makes sense, as you already have tags from multiple places you can use. Why waste time tagging everything manually on upload when it's already tagged neatly and correctly elsewhere? Not to mention individual tags on each file instead of one set for the whole bulk upload, which only gets you as far as a creator tag and other simple tags that are shared across all files. Using the filename and folders for tags is also limited by the NTFS limits, at least on Windows, and it's also 10x messier and could be prone to errors.

These sidecar files are made for each file, with the tags only for that file. Each line in the text sidecar is a tag in itself, so there are no parsing issues. One thing needed would be a regex replace of spaces with underscores, as boorus use underscores instead of spaces. Sidecar files also allow the user to give the file whatever name they want. No need to have tags filling a long filename and making a mess.
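A minimal sketch of that parsing step, assuming the sidecar is plain text with one tag per line as described above. The function name `parseSidecarTags` is hypothetical and not taken from the uploader's code.

```javascript
// Hypothetical helper: turn the text of a sidecar file into booru-style tags.
// Assumes one tag per line; blank lines are skipped.
function parseSidecarTags(text) {
  return text
    .split(/\r?\n/)                            // handle both Windows and Unix line endings
    .map((line) => line.trim())                // drop stray whitespace around each tag
    .filter((line) => line.length > 0)         // ignore empty lines
    .map((line) => line.replace(/\s+/g, "_")); // boorus use underscores instead of spaces
}
```

For example, a sidecar containing the lines `long hair` and `blue eyes` would yield `long_hair` and `blue_eyes`.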

At least two of those programs also support namespaces in their tags: categories that keep the tags sorted. In any case, it would allow for faster and more accurate tagging en masse, no matter where the tags originally came from, be it another booru or another source entirely.
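If namespaces matter on the receiving booru, they could be split off before upload. The sketch below assumes the `namespace:tag` form with a colon separator (as in `creator:name`); that format and the `splitNamespace` name are assumptions for illustration, so check what your exporter actually writes.

```javascript
// Hypothetical helper: split "namespace:tag" into its two parts.
// Assumes a colon separator; tags without one (or starting with one) get no namespace.
function splitNamespace(tag) {
  const i = tag.indexOf(":");
  if (i <= 0) {
    return { namespace: null, tag };
  }
  return { namespace: tag.slice(0, i), tag: tag.slice(i + 1) };
}
```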

[Attached screenshot: explorer_2018-08-03_15-29-57]

Seedmanc commented 6 years ago

Alright, now I see what you mean. I tried implementing it, you can try it out.

I don't think JS would allow accessing files outside of the user selection, so you would have to select sidecar files together with images.
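For illustration only, matching sidecars to images within one selection could look roughly like this. It is a sketch under the naming convention from this thread (`filename.jpg` next to `filename.jpg.txt`), not the script's actual implementation; `pairSidecars` is a made-up name, and it only relies on each selected item having a `name` property, as browser `File` objects do.

```javascript
// Hypothetical helper: pair each selected image with its sidecar, if one was selected too.
// Assumes the sidecar is named "<image name>.txt", e.g. "picture.jpg.txt" for "picture.jpg".
function pairSidecars(files) {
  const sidecars = new Map();
  const images = [];
  for (const file of files) {
    if (file.name.toLowerCase().endsWith(".txt")) {
      // Key the sidecar by the image name it belongs to (strip the trailing ".txt").
      sidecars.set(file.name.slice(0, -4), file);
    } else {
      images.push(file);
    }
  }
  // Images without a matching sidecar get sidecar: null and can fall back to manual tags.
  return images.map((image) => ({ image, sidecar: sidecars.get(image.name) || null }));
}
```

In a browser, the input would be something like `Array.from(input.files)`, and each sidecar's contents could then be read with `file.text()` before parsing.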

Seedmanc commented 6 years ago

So I'm assuming it works fine by now.