Open ocrhell opened 2 months ago
Have you tried it with a "metadata" post-processor?
https://gdl-org.github.io/docs/configuration.html#postprocessor-configuration
https://gdl-org.github.io/docs/configuration.html#postprocessor-options
For example:
{
    "extractor":
    {
        "booru":
        {
            "..": "..",
            "postprocessors": [
                {
                    "name"  : "metadata",
                    "event" : "post",
                    "mode"  : "custom",
                    "skip"  : true,
                    "content-format": "{content|description}\n",
                    "filename": "{id}.txt"
                }
            ]
        }
    }
}
Of course, you need to check the output with -K to see whether it's actually {content} you want, or {description}, or whatever the field name is for translations, assuming the site provides such translations at all.
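To make the intent of that config concrete, here is a rough Python sketch of the effect it is meant to have for a single post: use content if it is present and non-empty, otherwise description, and write the result to a file named after the post id. This is only an illustration, not gallery-dl's actual implementation. The function name and the sample metadata are hypothetical, and falling back to an empty string when both fields are missing is an assumption of the sketch.

```python
def render_note_file(metadata):
    """Return (filename, text) for one post, mimicking the config above."""
    # "{content|description}": take the first field that is present and non-empty.
    # Assumption: fall back to an empty string if neither field exists.
    text = metadata.get("content") or metadata.get("description") or ""
    # "filename": "{id}.txt"
    filename = "{}.txt".format(metadata["id"])
    # "content-format" ends with "\n", so the text gets a trailing newline.
    return filename, text + "\n"


# Hypothetical metadata; real field names must be checked with -K.
post = {"id": 12345, "description": "translated text"}
filename, text = render_note_file(post)
print(filename)
print(text, end="")
```

If -K shows neither content nor description for the site, both fields in the format string would need to be swapped for whatever name actually holds the translation.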
Closed it prematurely, sorry.
Going through the notes block from gelbooru.py and gelbooru_v02.py, is it possible to filter out height, width, x, y?
notes.append({
    "width" : int(extr(note, 'data-width="', '"')[0]),
    "height": int(extr(note, 'data-height="', '"')[0]),
    "x"     : int(extr(note, 'data-x="', '"')[0]),
    "y"     : int(extr(note, 'data-y="', '"')[0]),
    "body"  : extr(note, 'data-body="', '"')[0],
})
I've tried # but that doesn't work.
Tried multiple variations of notes.x / notes.width etc. with an additional postprocessors instance with delete before and after. Also didn't work.
-K gives notes[N]['width'] / notes[N]['height'] etc. and I've tried those too.
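For reference, this is the result I'm after, written as plain Python over the structure that -K reports (a list of dicts with width, height, x, y and body). It is only a sketch of the desired output. The sample notes are made up, note_bodies is a hypothetical helper, and none of this shows how to wire it into a gallery-dl config.

```python
def note_bodies(notes):
    """Keep only the translated text of each note, dropping geometry fields."""
    return [note["body"] for note in notes if note.get("body")]


# Made-up sample in the same shape as the notes list built in gelbooru.py.
notes = [
    {"width": 120, "height": 40, "x": 10, "y": 20, "body": "Hello"},
    {"width": 80, "height": 30, "x": 200, "y": 150, "body": "Good morning"},
]

# One note per line, which is what I'd like to end up in the text file.
text = "\n".join(note_bodies(notes))
print(text)
```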
Is there a way to extract notes (translations) and put them in a similarly named downloaded text file? Specifically gelbooru. When running an instance with this in the config file:
Only the image is downloaded and the notes aren't extracted at all. Not even in cmd. Should I be adding anything in gelbooru's block?
Thanks.