Nandaka / DanbooruDownloader

*booru image downloader
http://nandaka.devnull.zone/

how big is tags.xml ? #104

Closed maskkulin closed 7 years ago

maskkulin commented 7 years ago

I already have over 11k .xml files and I'm not even sure whether what I'm doing is right. Should I have ticked merge, or do they merge automatically when the download is complete?

wopian commented 7 years ago

If you ticked merge they'll all be merged into a single XML file once the tag download process has finished.
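To illustrate what such a merge amounts to, here is a minimal Python sketch (not the downloader's own code, which is written in C#). It assumes each downloaded file has a root element containing `<tag name="...">` children; the function name `merge_tag_files` and that layout are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def merge_tag_files(paths, out_path):
    """Merge <tag> elements from several XML files into a single file.

    Assumes every input has <tag name="..."> elements somewhere under its
    root; the real per-page files may be laid out differently.
    Returns the number of distinct tag names written.
    """
    merged = ET.Element("tags")
    seen = set()
    for path in sorted(paths):
        root = ET.parse(path).getroot()
        for tag in root.iter("tag"):
            name = tag.get("name")
            if name in seen:
                continue  # skip tags that show up in more than one file
            seen.add(name)
            merged.append(tag)
    ET.ElementTree(merged).write(out_path, encoding="utf-8", xml_declaration=True)
    return len(seen)
```

Called with a list of the per-page files and an output path, this writes one combined file and drops duplicate tag names along the way.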

maskkulin commented 7 years ago

That answered just one of my questions. I am now at 12k .xml files, almost 500 MB in total. How big is your XML file? I am just curious how much longer it will take until the download is finished. It's rather annoying because I'm getting a 429 error every few minutes, so I can't let it download overnight.
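HTTP 429 means "Too Many Requests", so the server is rate-limiting the client. A common way to survive that in an unattended run is to wait and retry with growing delays. The sketch below shows the idea in Python; it is not how the downloader behaves, and `fetch` is a hypothetical callable returning `(status, headers, body)` so the example stays independent of any particular HTTP library.

```python
import time


def fetch_with_backoff(fetch, url, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying on HTTP 429 with exponential backoff.

    `fetch` is a hypothetical callable returning (status, headers, body).
    Returns (status, body) for the first response that is not a 429.
    """
    for attempt in range(max_retries + 1):
        status, headers, body = fetch(url)
        if status != 429:
            return status, body
        if attempt == max_retries:
            break  # out of retries, give up below
        retry_after = headers.get("Retry-After", "")
        # Honor a numeric Retry-After header if present, else double the delay each time.
        delay = float(retry_after) if retry_after.isdigit() else base_delay * (2 ** attempt)
        sleep(delay)
    raise RuntimeError(f"still rate-limited after {max_retries} retries: {url}")
```

With the defaults the waits are 1, 2, 4, 8 and 16 seconds before the call gives up; a real overnight job would probably want a larger `base_delay`.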

Nandaka commented 7 years ago

I assume you are trying to update the tags XML from sankaku? It is not required anymore for that site, as it doesn't allow direct download of tags.xml through the API (I would need to parse each page of the tag list, and it has 15127 pages).
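As a rough answer to the size question, the figures in this thread (about 12k files, almost 500 MB, 15127 pages) allow a back-of-the-envelope estimate. The sketch below assumes one XML file per tag page and roughly uniform file sizes, neither of which is confirmed here; `estimate_total_mb` is just an illustrative helper.

```python
def estimate_total_mb(files_so_far, size_so_far_mb, total_pages):
    """Extrapolate the final download size from the average size per file.

    Assumes one file per page and roughly uniform file sizes.
    """
    avg_mb_per_file = size_so_far_mb / files_so_far
    return avg_mb_per_file * total_pages


total_mb = estimate_total_mb(files_so_far=12_000, size_so_far_mb=500, total_pages=15_127)
remaining_pages = 15_127 - 12_000
print(f"about {total_mb:.0f} MB in total, {remaining_pages} pages still to fetch")
```

Under those assumptions the full set would come to roughly 630 MB, with about 3,100 pages left at the 12k mark.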

What I do instead, specifically for sankaku, is parse the tags in the image post based on their CSS class. The limitation is that the actual tag type cannot be determined correctly until the post detail page is loaded (usually during image download); until then, the type shown is based on the current tags.xml data.
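The idea of reading the tag type from a CSS class can be sketched as follows. This is a Python illustration using only the standard library, not the downloader's actual C# parser; the markup shape (`<li class="tag-type-artist"><a>name</a></li>`) and the `tag-type-` prefix are assumptions about the page, and `TagTypeParser` and `parse_tag_types` are hypothetical names.

```python
from html.parser import HTMLParser


class TagTypeParser(HTMLParser):
    """Collect tag name -> tag type from <li class="tag-type-..."> items."""

    PREFIX = "tag-type-"  # assumed class prefix, not confirmed by the site

    def __init__(self):
        super().__init__()
        self.tags = {}
        self._current_type = None
        self._in_link = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li":
            classes = (attrs.get("class") or "").split()
            types = [c[len(self.PREFIX):] for c in classes if c.startswith(self.PREFIX)]
            self._current_type = types[0] if types else None
        elif tag == "a" and self._current_type is not None:
            self._in_link = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_link = False
        elif tag == "li":
            self._current_type = None

    def handle_data(self, data):
        name = data.strip()
        if self._in_link and name:
            # Keep the first type seen for a given tag name.
            self.tags.setdefault(name, self._current_type)


def parse_tag_types(html):
    """Return a dict mapping each tag name to the type found in its CSS class."""
    parser = TagTypeParser()
    parser.feed(html)
    return parser.tags
```

Because the type comes from the page markup itself, no tags.xml lookup is needed for posts parsed this way, which matches the limitation described above: the type is only known once that markup has been loaded.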

Before adding to download (tags from danbooru's tags.xml) - Main Tab [image]

After adding to download (parsed from the sankaku image post detail) - Download Tab [image]

Ensure that Use Global Tags.xml is disabled in the Settings Tab. [image]

maskkulin commented 7 years ago

Does this also work with batch mode?

Nandaka commented 7 years ago

It should work in batch mode. Please note that the image post tag parsing is only for sankaku.

maskkulin commented 7 years ago

Wow, so I downloaded 13k tags for no reason. You should really include this in your readme.txt. But now every time I open Danbooru Downloader it says there is no tags.xml and wants to download it.

Nandaka commented 7 years ago

It is still required for the other boorus 😄

sankaku is a bit 'special'

maskkulin commented 7 years ago

It's not working for me, please help.

Query: chakku!_tsuiteru!!
System.IO.FileNotFoundException: Cannot load tags.xml
File name: "tags.xml"
   at DanbooruDownloader3.DAO.DanbooruTagsDao..ctor(String xmlTagFile)
   at DanbooruDownloader3.DAO.DanbooruTagsDao.get_Instance()
   at DanbooruDownloader3.Entity.DanbooruProvider.LoadProviderTagCollection()
   at DanbooruDownloader3.Helper.ParseTags(String p, DanbooruProvider provider)
   at DanbooruDownloader3.Engine.SankakuComplexParser.Parse(String data, DanbooruSearchParam searchParam)
   at DanbooruDownloader3.DAO.DanbooruPostDao..ctor(Stream input, DanbooruPostDaoOption option)
   at DanbooruDownloader3.FormMain.GetBatchImageList(String url, String query, DanbooruBatchJob job)
   at DanbooruDownloader3.FormMain.DoBatchJob(BindingList1 batchJob)
2017-01-02 20:36:01,997 INFO - [DoBatchJob] Downloading list: https://chan.sankakucomplex.com/post/index?tags=chakku!_tsuiteru!!+rating:explicit&page=2
2017-01-02 20:36:03,401 INFO - [DoBatchJob] No more image.
2017-01-02 20:36:03,433 INFO - [DoBatchJob] Batch Job #0: Done