mipops / dvrescue

Archivist-made software that supports data migration from DV tapes into digital files suitable for long-term preservation. Snapshot daily builds are at https://mediaarea.net/download/snapshots/binary/dvrescue/.
BSD 3-Clause "New" or "Revised" License

add prefix/suffix options for file names #752

Open libbyhopfauf opened 10 months ago

libbyhopfauf commented 10 months ago

It would be really helpful if there were prefix/suffix options for naming files prior to capture (similar to vrecord) to make that part more efficient. Currently, dvcapture will let you copy/paste into the output file name field, but that's a little tedious. It would be helpful to have this as part of packager as well, so that users can name the output files in a manner that adheres to their organization's naming conventions.

libbyhopfauf commented 7 months ago

Another option that was requested by a user recently is to have packager name the files based on the recording date (if the files are being split this way). For example, if an original file was named "Item_1234.dv" and contained three recording dates (May 5, 2002; May 6, 2002 and May 7, 2002), then if you selected "recording date" for the segmentation parameters, the output files would be:

  • Item_1234_05-05-2002.mkv
  • Item_1234_05-06-2002.mkv
  • Item_1234_05-07-2002.mkv

dericed commented 7 months ago

Thx for nudging this issue. Keep in mind that for any naming convention that doesn't include an incrementing number or uuid then we need a policy for handling outputs that would have the same name. Note that not all dv recordings would have recording dates either.
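
To make the collision concern concrete, here is a minimal sketch of one possible policy: when a generated name has already been used, append a counter before the extension. The function name `make_unique` and the `_2`, `_3` suffix style are assumptions for illustration only, not how dvpackager handles this.

```python
import os


def make_unique(names):
    """Return the names in order, adding _2, _3, ... to any repeats.

    Hypothetical policy for illustration; not dvpackager's behavior.
    """
    used = set()
    result = []
    for name in names:
        stem, ext = os.path.splitext(name)
        candidate = name
        counter = 1
        # Keep bumping the counter until the candidate has not been used.
        while candidate in used:
            counter += 1
            candidate = f"{stem}_{counter}{ext}"
        used.add(candidate)
        result.append(candidate)
    return result


# Two segments share a recording date, and one has no date at all.
planned = [
    "Item_1234_2002-05-05.mkv",
    "Item_1234_2002-05-05.mkv",
    "Item_1234_XXXX-XX-XX.mkv",
]
unique_names = make_unique(planned)
print(unique_names)
```

A counter keeps the original name visible and sorts next to its siblings; a UUID would also guarantee uniqueness but is harder to read in a file listing.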

retokromer commented 7 months ago

I suggest representing dates according to ISO 8601 or RFC 3339.
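
As a rough illustration of what those two formats look like, here is a short Python sketch. The timestamp and the UTC-05:00 offset are made-up values, not data from any tape.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical recording timestamp, for illustration only.
recorded = datetime(2002, 5, 5, 14, 30, 0)

# ISO 8601 calendar date: YYYY-MM-DD.
iso_date = recorded.date().isoformat()

# RFC 3339 timestamps carry a UTC offset; -05:00 is an assumed offset here.
aware = recorded.replace(tzinfo=timezone(timedelta(hours=-5)))
rfc3339 = aware.isoformat()

print(iso_date)   # 2002-05-05
print(rfc3339)    # 2002-05-05T14:30:00-05:00
```

Note that the full timestamp contains colons, which some file systems do not accept in file names, so a file-name variant would likely need a different time separator.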

BriBek commented 7 months ago

As someone who doesn’t know how to code I’ll probably embarrass myself by dipping an oar in here, but it seems that what would be incredibly useful in editing clips imported from dvcapture would be radio buttons allowing for a choice.

For example:

  1. [capture clip name]+YYYY_MM_DD
  2. [clip name]+YYYY_MM_DD+recording time
  3. (maybe a timecode option?)
  4. None, just number sequentially as it does now.

And a checkbox option for adding UTC offset.

Again, sorry to pipe up! I’m just a new DVRescue user who is floored by how good and useful this program is for a project I’m working on that has hundreds of MiniDV tapes resulting in many thousands of clips.

dericed commented 7 months ago

Thx @BriBek,

So the values we'd need to make the filename would be?

BriBek commented 7 months ago

That sounds quite comprehensive, dericed! It would be terrific to have those options.

libbyhopfauf commented 7 months ago

Thx for nudging this issue. Keep in mind that for any naming convention that doesn't include an incrementing number or uuid then we need a policy for handling outputs that would have the same name. Note that not all dv recordings would have recording dates either.

Noted on the sequential for the same number. In this case, maybe we could add the full time/date? If I was interpreting correctly, I believe this was just for instances where someone is splitting according to date via the segmentation rules.

libbyhopfauf commented 7 months ago

Never mind, looks like @BriBek explained it :)

libbyhopfauf commented 7 months ago

Another option that was requested by a user recently is to have packager name the files based on the recording date (if the files are being split this way). For example, if an original file was named "Item_1234.dv" and contained three recording dates (May 5, 2002; May 6, 2002 and May 7, 2002), then if you selected "recording date" for the segmentation parameters, the output files would be:

  • Item_1234_05-05-2002.mkv
  • Item_1234_05-06-2002.mkv
  • Item_1234_05-07-2002.mkv

Would prefer YYYY-MM-DD, just to clarify, but wasn't sure how that information was presented/collected and if that would be a problem :)

BriBek commented 7 months ago

I don’t know if having incremental numbers and timestamps is redundant, but it’s probably handy organizationally just because it would look better in a list of file names. So, yes, clip number, then date, though the year should probably appear first, because otherwise long lists of files will sort by month and the years will not be in proper sequence.

Item_1234_YYYY-MM-DD.mkv

Then as an additional option from checkboxes:

Item_1234_YYYY-MM-DD_HH_MM_SS.mkv

Or even, if a user selects it, timecode

Item_1234_YYYY-MM-DD_HH_MM_SS_timecode

So, it could have a list of checkboxes to include those and to select formatting of date.

I feel terrible for suggesting such complications!
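
A small sketch of the sorting point, using made-up dates: with the year first, a plain alphabetical sort of the file names is also chronological, while month-first names are not.

```python
# Hypothetical recording dates as (year, month, day), for illustration only.
dates = [(2002, 5, 5), (2001, 12, 31), (2002, 1, 15)]

# Year-first names: alphabetical order matches chronological order.
year_first = sorted(f"Item_1234_{y:04d}-{m:02d}-{d:02d}.mkv" for y, m, d in dates)

# Month-first names: alphabetical order groups by month, so years interleave.
month_first = sorted(f"Item_1234_{m:02d}-{d:02d}-{y:04d}.mkv" for y, m, d in dates)

print(year_first)
print(month_first)
```

In the month-first list, the December 2001 file sorts last even though it is the earliest recording.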

BriBek commented 7 months ago

When someone who knows nothing about coding imagines a convenient way to encounter this on the packaging page this comes to mind:

Filename options menu

[radio button] Default (DVPackager will number clips sequentially and add nothing). Everything below is dimmed or invisible if selected.

[radio button] Customize filename with: [options dimmed until checked]

      [checkbox] Date

      If checked:

            [radio button] YYYY-MM-DD
            [radio button] YYYY-DD-MM
            [radio button] MM-DD-YYYY
            [radio button] DD-MM-YYYY

      [checkbox] Add time of day to filename?

      [checkbox] Add timecode to filename?

Cherry on top would be if the preferences could be made sticky.

BriBek commented 7 months ago

Just for reference, in case it’s convenient to see, here are a few lines from the XML file generated by DVRescue. Date and time are nicely combined. There could be a much simpler solution with just one checkbox to copy that as-is to the filename.

The prefix should default to the filename specified during capture; otherwise, might there be confusion somewhere downstream?

beker@Mac-Mini ~ % dvpackager -s -a c -T /Volumes/D2\ -\ Video/MiniDV\ imports/DVRescue\ imports/Nepal\ Digitizations/Nepal\ CR-36/Nepal\ CR-36.dv.dvrescue.xml
00:00:00.000000|00:00:45.311933|0|1357|00:00:00;02|0|1997-12-2116:02:33|720x480|30000/1001|4:1:1|4/3|32000|4||||-|162960000|-|./_part1.mov
00:00:45.311933|00:01:44.137366|1358|3120|00:00:45;10|162960000|1997-12-2116:03:26|720x480|30000/1001|4:1:1|4/3|32000|4|1|1||-|374520000|-|./_part2.mov
00:01:44.137366|00:02:06.626500|3121|3794|00:01:44;05|374520000|1997-12-2116:12:46|720x480|30000/1001|4:1:1|4/3|32000|4|1|1||-|455400000|-|./_part3.mov

[It might be a nice adjustment if there were a separator between date and time for readability]

retokromer commented 7 months ago

[It might be a nice adjustment if there were a separator between date and time for readability]

T is the canonical separator. Internally we use _ because we consider it more readable.

BriBek commented 7 months ago

I was referring to the presentation of the date and time in the XML file: when I used the dvpackager command to translate it, there was no separator.

dericed commented 7 months ago

Thx @BriBek, that -T option is actually rather internal and is used to communicate data from the dvpackager CLI to the GUI. So the info there is presented in a pipe-separated list and then shown in the packaging page like this:

[image: packaging page screenshot]

I had used a process that would clean up the spaces in the incoming data but forgot that the recording timestamps could have spaces. I fixed that here: https://github.com/mipops/dvrescue/pull/821/files

I'm curious why you use dvpackager -T file.dv.dvrescue.xml rather than dvpackager -n file.dv. Here are the two outputs:

00:00:00.000000|00:00:01.968633|0|58|02:00:00:00|0|1970-01-01 00:00:00|720x480|30000/1001|4:1:1|16/9|48000|2||||-|7080000|-|./_part1.mov
00:00:01.968633|00:00:03.937266|59|117|01:00:00:00|7080000|1970-01-01 00:00:00|720x480|30000/1001|4:1:1|4/3|48000|2||2|2|-|14160000|-|./_part2.mov
00:00:03.937266|00:00:07.937266|118|217|04:00:00:00|14160000|1970-01-01 00:00:00|720x576|25|4:1:1|4/3|48000|2||2|1|-|28560000|-|./_part3.mov
00:00:07.937266|00:00:09.937266|218|267|05:00:00:00|28560000|1970-01-01 00:00:00|720x576|25|4:2:0|4/3|48000|2||2|1|-|35760000|-|./_part4.mov
00:00:09.937266|00:00:11.905900|268|326|06:00:00:00|35760000|1970-01-01 00:00:00|720x480|30000/1001||4/3|48000|2||2|1|-|49920000|-|./_part5.mov
00:00:11.905900|00:00:13.905900|327|376|07:00:00:00|49920000|1970-01-01 00:00:00|720x576|25||4/3|48000|2||2|1|-|64320000|-|./_part6.mov

vs

Analyzing mix.dv
# Segmentation options. Split on: Recording_Start_Marker = 0, Recording_Timestamp_Jump = 0, Timecode_Jump = 0, Audio_characteristics_change = 0, Aspect_Ratio_change = 1
#         St='Flagged Start of a recording', ncTC='non-continuous timecode value', ncR='non-continuous recording timestamp value'
  # | PTS Range                         | Duration | Frame Range         | Byte Range                | Timecode    | Recording Timestamp    | Size      | Frame Rate | DAR   | ChSub | Audio     | St | ncTC | ncR |
  1 | 00:00:00.000000 - 00:00:01.968633 |    1.969 |        0 -       58 |           0 -     7080000 | 02:00:00:00 | 1970-01-01 00:00:00    |   720x480 | 30000/1001 |  16/9 | 4:1:1 | 2ch 48000 |    |      |     |
  2 | 00:00:01.968633 - 00:00:03.937266 |    1.969 |       59 -      117 |     7080000 -    14160000 | 01:00:00:00 | 1970-01-01 00:00:00    |   720x480 | 30000/1001 |   4/3 | 4:1:1 | 2ch 48000 |    |    2 |   2 |
  3 | 00:00:03.937266 - 00:00:07.937266 |    4.000 |      118 -      217 |    14160000 -    28560000 | 04:00:00:00 | 1970-01-01 00:00:00    |   720x576 |         25 |   4/3 | 4:1:1 | 2ch 48000 |    |    1 |   2 |
  4 | 00:00:07.937266 - 00:00:09.937266 |    2.000 |      218 -      267 |    28560000 -    35760000 | 05:00:00:00 | 1970-01-01 00:00:00    |   720x576 |         25 |   4/3 | 4:2:0 | 2ch 48000 |    |    1 |   2 |
  5 | 00:00:09.937266 - 00:00:11.905900 |    1.969 |      268 -      326 |    35760000 -    49920000 | 06:00:00:00 | 1970-01-01 00:00:00    |   720x480 | 30000/1001 |   4/3 |       | 2ch 48000 |    |    1 |   2 |
  6 | 00:00:11.905900 - 00:00:13.905900 |    2.000 |      327 -      376 |    49920000 -    64320000 | 07:00:00:00 | 1970-01-01 00:00:00    |   720x576 |         25 |   4/3 |       | 2ch 48000 |    |    1 |   2 |
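
For anyone who wants to read the pipe-separated -T lines in a script, here is a minimal Python sketch. The field names are assumptions inferred by lining up the -T output against the columns of the -n table above; the -T format is internal and may change, and the flag fields between the audio channels and the byte end are left unnamed because their order is not obvious from the sample.

```python
# Field names are guesses from comparing the -T and -n outputs above;
# they are not a documented format.
FIELDS = [
    "pts_start", "pts_end", "frame_start", "frame_end", "timecode",
    "byte_start", "rec_timestamp", "size", "frame_rate",
    "chroma_subsampling", "dar", "audio_rate", "audio_channels",
]


def parse_segment(line):
    """Split one -T line into a dict of the fields we can identify."""
    parts = line.rstrip("\n").split("|")
    if len(parts) != 20:
        raise ValueError(f"expected 20 fields, got {len(parts)}")
    segment = dict(zip(FIELDS, parts))
    segment["byte_end"] = parts[17]   # assumed from the Byte Range column
    segment["output"] = parts[19]     # suggested output file name
    return segment


sample = (
    "00:00:00.000000|00:00:01.968633|0|58|02:00:00:00|0|1970-01-01 00:00:00|"
    "720x480|30000/1001|4:1:1|16/9|48000|2||||-|7080000|-|./_part1.mov"
)
segment = parse_segment(sample)
print(segment["rec_timestamp"], segment["output"])
```

Empty fields (such as a missing chroma subsampling value) come through as empty strings rather than being dropped, so the positions stay stable.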

BriBek commented 7 months ago

@dericed Oh, man… I wish I’d known about that!

I wasn’t kidding when I said that I had no idea what I was doing. I don’t know how to code and even using Terminal is a struggle. If you saw the insane wrangling I went through trying to get some guy’s code to work when using dvpackager from the command line (which was fine, but the GUI is so much sleeker) before I knew the GUI existed, you’d still be on the floor laughing.

I used the command line dvpackager included at the end of the help file it generated in response to bad commands, and since the output, messy as it was, gave me the data I needed (I just put it in a readme file in each tape’s folder), I was glad enough to have it. And the dvpackager-suggested command was for the .xml file and not the .dv file, so I didn’t even know I could regenerate it that way.

So, the answer to why I was using it is I’m an ignoramus when it comes to this stuff. Thanks for the tip. But, oddly, when I run that command on a .dv file I just got through capturing, it doesn’t break it down into the 107 clips that dvpackager sees, but instead into just one clip. I see that it’s because the only thing selected is aspect ratio change, which doesn’t change on the tape. That is the default on the packager page, and I don’t add the other selections until I get to the packager window after capturing. I just captured this tape and running the command you gave me I get this:

Screenshot 2024-02-07 at 11 25 31 PM

versus this when I run the other command on the .xml:

Screenshot 2024-02-07 at 11 26 53 PM

Is some change made to the .dv file itself after the options are selected in packager?

In any event, using the GUI I am able to power through this project and it’s really great. I can’t begin to tell you how grateful I am for this. I apologize for not being able to discuss this competently.

By the way, I’m sure you’ve seen this guy’s blog entry, which is how I learned about DVRescue. In it he says he wanted to change the filename output to dates and did it with this code, which was what I spent hours trying to get to work and never could:

Screenshot 2024-02-07 at 11 10 46 PM

dericed commented 5 months ago

Hi @BriBek @libbyhopfauf I opened a pull request on this one. Here's the draft user documentation:

 -O <pattern>
          (specify a pattern for output files. The following variables may be
           used:
          %FILENAME% - will use the filename of the input file without its
                       extension
          %RECDATE% -  will use the recording date of the first output frame,
                       in YYYY-MM-DD format. If there is no embedded recording
                       date, then 'XXXX-XX-XX' will be used.
          %RECTIME% -  will use the recording time of the first output frame.
                       If there is no embedded recording time, then
                       'XX-XX-XX' will be used.
          %TC% -       will use the timecode value of the first frame or use
                       XX-XX-XX-XX if no timecode is stored in the first frame.
          %PARTNO% -   This is an incrementing number of the output starting
                       from 1.
           The default pattern is "%FILENAME%_part%PARTNO%". The
           extension of the output file is determined by the -e
           setting.)

I still have more to do, such as dealing with cases where the requested data isn't available and handling when the requested naming pattern would cause repeated names. Comments welcome.
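
To show how the draft variables might expand, here is a minimal Python sketch. The function `expand_pattern` and its parameters are hypothetical and only mimic the documented behavior (including the 'X' placeholders for missing values); it is not the dvpackager implementation, and the hyphenated time and timecode values are assumed formats.

```python
import os
import re


def expand_pattern(pattern, input_file, partno, recdate=None, rectime=None, tc=None):
    """Expand %VARIABLE% tokens in a naming pattern (illustrative only)."""
    values = {
        # Input file name without its directory or extension.
        "FILENAME": os.path.splitext(os.path.basename(input_file))[0],
        # Placeholders follow the draft documentation for missing data.
        "RECDATE": recdate or "XXXX-XX-XX",
        "RECTIME": rectime or "XX-XX-XX",
        "TC": tc or "XX-XX-XX-XX",
        "PARTNO": str(partno),
    }
    # Single pass, so substituted text is never re-scanned for tokens.
    return re.sub(
        r"%(FILENAME|RECDATE|RECTIME|TC|PARTNO)%",
        lambda match: values[match.group(1)],
        pattern,
    )


default_name = expand_pattern("%FILENAME%_part%PARTNO%", "Item_1234.dv", 1)
dated_name = expand_pattern(
    "%FILENAME%_%RECDATE%_%RECTIME%", "Item_1234.dv", 2,
    recdate="2002-05-05", rectime="14-30-00",
)
undated_name = expand_pattern("%FILENAME%_%RECDATE%_%RECTIME%", "Item_1234.dv", 3)
print(default_name, dated_name, undated_name)
```

The last call shows why repeated names are a concern: every segment without an embedded recording date would expand to the same string unless %PARTNO% or some other unique value is part of the pattern.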

libbyhopfauf commented 5 months ago

@dericed this looks great! For the ones where the info isn't there (like the RECDATE is missing), would doing a rolling sequence like packager currently does work? For example:

dericed commented 3 months ago

OK, https://github.com/mipops/dvrescue/pull/852 is updated to produce unique output names and is ready to test in the CLI.

dericed commented 3 months ago

For the GUI, we need some suggested naming patterns. How about:

%FILENAME%_part%PARTNO%
%FILENAME%_%RECDATE%_%RECTIME%
%FILENAME%_%PARTNO%_%TC%
%FILENAME%_%RECDATE%-%RECTIME%_%TC%

dericed commented 3 months ago

Actually, I started an issue for it at https://github.com/mipops/dvrescue/issues/876

libbyhopfauf commented 3 months ago

actually, I started an issue for it at #876

I added some mock-up examples for the GUI for reference/feedback.