Open biggestsonicfan opened 11 months ago
"event": "post"
triggers once for each "post" / Tweet / container-like thing that can contain files. This happens before any files are processed, so you don't get any filenames or paths, and it also doesn't matter how many files this post contains.
"event": "prepare"
(and `file`, `after`, `skip`) triggers for each file download. It therefore does not trigger when there are no files, but there is filename metadata available when it does.
> Do I need a separate `prepare` and `post` json metadata grabber
Depending on what exactly you want to achieve, you might need both.
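As a rough sketch of what "both" could look like, the snippet below builds a config with two `metadata` post processors, one per event, and prints it as JSON. The `{id}.json` filename for the `post` entry is an assumption: it relies on the extractor providing an `id` field, and it is set because file-level fields such as `filename` are not available at that point.

```python
import json

# Two "metadata" post processors: one runs once per post, one per file.
config = {
    "extractor": {
        "postprocessors": [
            {
                "name": "metadata",
                "event": "post",
                # assumption: the extractor provides an "id" field
                "filename": "{id}.json",
            },
            {
                "name": "metadata",
                "event": "prepare",
            },
        ]
    }
}

# Print the structure as JSON so it can be compared with a config file.
print(json.dumps(config, indent=4))
```

With a setup along these lines, text-only posts would still get a post-level JSON file, while every downloaded file gets its own file-level one.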
Okay, after reading your response, downloading a tweet twice with both `prepare` and `post`, comparing the json files, then rereading your response, I finally get it.
What I do want is a json metadata file per downloaded file from Patreon, but I also want to get the text posts. This could get tricky...
Can we add a test to `test_postprocessor.py` to spit out the unique (or otherwise not null) entries per `prepare` and `post` processes for a given/each extractor?
That can already be done with `-K` or `-j`.
`post` has directory metadata, `prepare` has file metadata.
For Patreon in particular, the difference between the two is `hash`, `type`, `num`, `filename`, and `extension` when going by the code.
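To check this kind of difference by hand, a small comparison like the one below may help. It is only a sketch: `metadata_diff` is a hypothetical helper, the sample values are made up, and it assumes the two metadata dicts have already been loaded, for example from the JSON files written for each event.

```python
def metadata_diff(post, prepare):
    """Return the entries that `prepare` adds or fills in compared to `post`."""
    diff = {}
    for key, value in prepare.items():
        if value is None:
            continue  # nothing useful on the file side either
        if post.get(key) is None:
            # key is missing or None in the post-level metadata
            diff[key] = value
    return diff


# Made-up sample values, for illustration only.
post = {"id": 1, "title": "example", "filename": None, "extension": None}
prepare = {"id": 1, "title": "example", "num": 1,
           "filename": "image", "extension": "png"}

print(sorted(metadata_diff(post, prepare)))
# prints ['extension', 'filename', 'num']
```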
How much of an undertaking would it be to create a consolidated metadata category, where it's populated by the premetadata, processes normally extracting metadata as it goes along (if any downloads occur), then replaces any `None` values with post metadata and adds new keys?
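To make the idea concrete, the merge I have in mind could be sketched roughly as below. This is not an existing gallery-dl feature; `consolidate` is a hypothetical helper and the sample dicts are made up.

```python
def consolidate(post, file_meta):
    """Merge post-level and file-level metadata into one dict.

    File-level values take priority, except that a None value never
    overwrites a value already present in the post-level metadata.
    """
    merged = dict(post)  # start from the post-level metadata
    for key, value in file_meta.items():
        if value is not None or key not in merged:
            merged[key] = value
    return merged


# Made-up sample values, for illustration only.
post = {"id": 1, "content": "text", "filename": None}
file_meta = {"id": 1, "content": None, "filename": "image", "extension": "png"}

print(consolidate(post, file_meta))
```

For a text-only post there would be no file-level metadata at all, so the result would simply be the post-level dict.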
I currently use this setup for grabbing all metadata into json files for parsing:
If I use `"event": "prepare",`, I cannot download text-only posts from Patreon. If I use `"event": "post",`, filenames for Twitter video files are parsed as `None` and the extension type is not displayed when a duplicate file is detected in the command line. Do I need a separate `prepare` and `post` json metadata grabber or are these just bugs?