chapmanjacobd / library

80+ CLI tools to build, browse, and blend your media library: an index for your archive.
BSD 3-Clause "New" or "Revised" License

Full video description not recorded in database #23

Closed deldesir closed 8 months ago

deldesir commented 8 months ago

Describe the bug

After downloading a video with the lb dl command, the complete description/caption of the video is no longer recorded in the description field of the database. Only the web path is stored in this field.

To Reproduce

  1. Execute the following command in the terminal using xklb:
    lb dl /path/to/database.db --video https://youtu.be/example-video-id --verbose
  2. Examine the description field in the database to confirm whether the full description is accurately recorded.

Expected behavior

The entire description of the downloaded video should be captured in the description field of the database entry.

Screenshots

N/A

Desktop (please complete the following information):

Additional context

Without the full description in the database, users cannot see the available information about the downloaded videos, as highlighted in this reported case.

holta commented 8 months ago

@deldesir would you happen to know when this change/regression was first noticed — sometime within the past week presumably?

deldesir commented 8 months ago

@holta I don't have an exact timestamp for when the issue started, but I can confirm that the last known working state was on December 6 at 21:01. The problem was identified after this point. If there's a need for more context, I can provide additional details based on my tests, but I cannot narrow down the timeframe further.

chapmanjacobd commented 8 months ago

It should be recorded in the captions table. The description column was moved out of the media table because the table was getting too big and querying was becoming really slow for large databases. It has been like this for about six months (v1.30.001 or so): https://github.com/chapmanjacobd/library/blame/9e27264b2a3783130c34d06ed79454bc5ed84f4e/xklb/fs_extract.py#L181
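
As a rough illustration of reading a description back through the captions table, here is a minimal sketch. The column names (`media.id`, `media.path`, `captions.media_id`, `captions.time`, `captions.text`) are assumptions and are not confirmed in this thread, and `find_descriptions` is a hypothetical helper, not part of library. The in-memory database only stands in for a real one; check the actual layout with `.schema captions` in the sqlite3 shell before relying on it.

```python
import sqlite3


def find_descriptions(conn, path):
    """Return the caption/description text rows linked to the media row with this path.

    Hypothetical helper: assumes captions.media_id references media.id.
    """
    rows = conn.execute(
        """
        SELECT c.text
        FROM captions AS c
        JOIN media AS m ON m.id = c.media_id
        WHERE m.path = ?
        ORDER BY c.rowid
        """,
        (path,),
    ).fetchall()
    return [row[0] for row in rows]


# Stand-in database with an ASSUMED, simplified schema (not the real xklb schema).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE media (id INTEGER PRIMARY KEY, path TEXT);
    CREATE TABLE captions (media_id INTEGER, time INTEGER, text TEXT);
    """
)
conn.execute(
    "INSERT INTO media (id, path) VALUES (1, 'https://youtu.be/example-video-id')"
)
conn.execute(
    "INSERT INTO captions (media_id, time, text) "
    "VALUES (1, 0, 'Full description of the example video')"
)

descriptions = find_descriptions(conn, "https://youtu.be/example-video-id")
print(descriptions)
```

Against a real database, the same function could be called with `sqlite3.connect("/path/to/database.db")`, provided the assumed column names match.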

If you are searching the description, the captions table should be fine, but if you are pulling out the description to display somewhere else you should probably use ffprobe on the downloaded file (see reformat_ffprobe(path) in xklb.scripts.playback_control for an example) or --write-info-json to save a JSON file: https://github.com/yt-dlp/yt-dlp#filesystem-options
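
For the --write-info-json route, a minimal sketch of pulling the description back out of the saved file could look like this. yt-dlp writes the video metadata as a JSON object with a `description` key. The helper name `read_description` and the sample file written below are illustrative assumptions, not part of library or yt-dlp.

```python
import json
import tempfile
from pathlib import Path


def read_description(info_json_path):
    """Return the 'description' field of a yt-dlp .info.json file, or None if absent."""
    with open(info_json_path, encoding="utf-8") as f:
        info = json.load(f)
    return info.get("description")


# Demo with a stand-in file; a real one would be produced by
# running yt-dlp with --write-info-json next to the downloaded video.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "example.info.json"
    sample.write_text(
        json.dumps({"id": "example-video-id", "description": "Full description text"}),
        encoding="utf-8",
    )
    description = read_description(sample)

print(description)
```

Reading the sidecar file keeps the full text available for display without putting it back into the media table.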

If that's not a good solution for you, then I suggest copying some of the code out into your own software instead of calling library directly, or looking into other solutions, perhaps https://github.com/tubearchivist/tubearchivist

holta commented 8 months ago

@chapmanjacobd That's an incredibly thoughtful response:

All 3 of your workaround suggestions are in fact extremely relevant. Thank you!

CONTEXT: Low-income educators very often have smaller/targeted video collections (in contrast to the "archivist" and "hoarding" demographics, for example). So we did not realize the description column in the media table would slow things down unacceptably for others! :upside_down_face: