wildyeast / sadiss

A socially aggregated distributed sound system.
GNU Affero General Public License v3.0

Streamline upload of SRT files for TTS #116

Open matbind opened 5 months ago

matbind commented 5 months ago

Currently one has to laboriously click upload for every voice in every language. Drag and drop would make this easier; ideally one could just drop the whole folder and we would assign the files to their voices by filename. Or maybe there is another way that doesn't force the user to follow a file naming schema? Need to think about this.

wildyeast commented 5 months ago

Can the data required for mapping be read from the file contents?

matbind commented 5 months ago

I don't think we can. There is not a lot of information in the files themselves sadly. This is what an SRT file looks like:

1
00:00:03,603 --> 00:00:06,906
This is the first line.

2
00:00:16,983 --> 00:00:19,552
This is the second line.

3
00:00:36,069 --> 00:00:38,405
This is the third line.
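
To illustrate why the file contents do not help with the mapping, here is a minimal Python sketch of an SRT parser. The names `Cue` and `parse_srt` are made up for this example and are not part of sadiss. Each cue only yields an index, a start time, an end time and the text; nothing identifies a voice or a language.

```python
import re
from dataclasses import dataclass

# Hypothetical helper type for illustration; not part of the sadiss codebase.
@dataclass
class Cue:
    index: int
    start: str
    end: str
    text: str

# Timing line, e.g. "00:00:03,603 --> 00:00:06,906"
TIMING = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})")

def parse_srt(content: str) -> list[Cue]:
    """Split SRT content into cues; cue blocks are separated by blank lines."""
    cues = []
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 3:
            continue  # skip blocks without index, timing and text
        match = TIMING.match(lines[1].strip())
        if not match or not lines[0].strip().isdigit():
            continue  # skip malformed blocks
        cues.append(Cue(int(lines[0]), match.group(1), match.group(2), "\n".join(lines[2:])))
    return cues
```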

KlienVo commented 5 months ago

I think it would need to be in some sort of naming convention. Probably something like:

trackname_en-US_00.srt

trackname_de-DE_00.srt

trackname_en-US_01.srt

trackname_de-DE_01.srt

..

Does this sound reasonable?
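
A rough Python sketch of how such names could be parsed, assuming the schema is track name, then a four-letter language code with a hyphen, then a number that is taken to be the voice id (that last reading is an assumption). `SrtName` and `parse_name` are hypothetical names used only for this example.

```python
import re
from typing import NamedTuple, Optional

class SrtName(NamedTuple):
    track: str
    language: str
    voice: int

# Assumed schema: <track>_<language code>_<voice number>.srt
PATTERN = re.compile(
    r"^(?P<track>.+)_(?P<language>[A-Za-z]{2}-[A-Za-z]{2})_(?P<voice>\d+)\.srt$"
)

def parse_name(filename: str) -> Optional[SrtName]:
    """Return the parsed parts, or None if the name does not follow the schema."""
    m = PATTERN.match(filename)
    if m is None:
        return None
    return SrtName(m["track"], m["language"], int(m["voice"]))
```

Files for which `parse_name` returns `None` would still have to be assigned by hand.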

KlienVo commented 5 months ago

Actually, it would rather need to be something like this: Trackname#ID-Name#Language-codes_#ID-Number.srt

Or am I confused?

wildyeast commented 5 months ago

Would it be possible to add a block of mapping information on top of the individual files? Something like

{
  language: "en-US",
  id: 0
}

1
00:00:03,603 --> 00:00:06,906
This is the first line.

2
00:00:16,983 --> 00:00:19,552
This is the second line.

3
00:00:36,069 --> 00:00:38,405
This is the third line.

In this way, we would not force users to adhere to complicated naming standards, which would possibly be more annoying than the way upload is handled now.
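
A minimal Python sketch of how such a header could be separated from the cues on upload. It assumes the header is written as valid JSON (quoted keys, unlike the object literal above) and sits at the very top of the file; `split_header` is a hypothetical name.

```python
import json

def split_header(content: str) -> tuple[dict, str]:
    """Return (metadata, srt_body). Metadata is empty if the file has no header."""
    stripped = content.lstrip()
    if not stripped.startswith("{"):
        return {}, content
    # raw_decode parses one JSON value and reports the index where it ended.
    metadata, end = json.JSONDecoder().raw_decode(stripped)
    return metadata, stripped[end:].lstrip()
```

Note that a file with such a header is no longer a plain SRT file, so the header would have to be added after export and stripped before any other tool reads the file.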

KlienVo commented 4 months ago

Wouldn’t this break our workflow (exporting SRTs from DaVinci & Reaper)?

matbind commented 4 months ago

I think we can't get around using a file naming convention. We currently need the voice id and the language the file is associated with. We could use a naming convention like this:

Here each attribute is associated with its value via = and the attribute-value pairs are separated by _. This has the advantage that

A disadvantage is that it is somewhat hard to type by hand.

The quickest way of streamlining the SRT upload (that I can think of right now) would be another button in the uploading interface which allows uploading whole folders. All files in the folder would then be assigned to their associated voice-lang combination, provided the correct naming convention is followed. After this step users can still upload files for individual voice-lang combinations if they want. Otherwise the track creation workflow remains the same.
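
A rough Python sketch of the folder assignment, assuming attribute names `voice` and `lang` and a file name such as `voice=1_lang=en-US.srt`. These names and both functions are hypothetical and only illustrate the = and _ scheme.

```python
from pathlib import PurePath

def parse_attributes(filename: str) -> dict[str, str]:
    """Parse 'key=value' pairs separated by '_' from the file name without extension."""
    stem = PurePath(filename).stem
    attributes = {}
    for pair in stem.split("_"):
        key, sep, value = pair.partition("=")
        if sep and key and value:
            attributes[key] = value
    return attributes

def assign_files(filenames: list[str]) -> tuple[dict[tuple[int, str], str], list[str]]:
    """Map (voice, lang) to a file name; collect files that don't follow the convention."""
    assigned = {}
    skipped = []
    for name in filenames:
        attrs = parse_attributes(name)
        follows_convention = (
            name.lower().endswith(".srt")
            and attrs.get("voice", "").isdigit()
            and "lang" in attrs
        )
        if follows_convention:
            assigned[(int(attrs["voice"]), attrs["lang"])] = name
        else:
            skipped.append(name)
    return assigned, skipped
```

The order of the pairs does not matter here, and unknown attributes are simply ignored. Values containing _ would break this simple split. Skipped files could still be uploaded individually.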

@KlienVo @wildyeast @tobiasleibetseder Any comments on this?

wildyeast commented 4 months ago

While I can't see any issues arising from using an equals sign in a filename, I've certainly never seen it in live applications.

Another option would be to add a manifest / metadata file that is parsed when uploading the folder. The file would contain the mapping information:

[
  {
    voice: 1,
    language: 'en-US',
    filename: 'english.srt'
  },
  {
    voice: 2,
    language: 'de-AT',
    filename: 'german.srt'
  }
]

This approach builds upon the advantages described in @matbind's post (arbitrary structure, extendability, additional information) by offering readability, ease of writing, and maintenance of order.
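
A minimal Python sketch of reading such a manifest, assuming it is stored as valid JSON (double-quoted keys and strings, unlike the literal above) and uploaded together with the SRT files. `map_from_manifest` is a hypothetical name.

```python
import json

def map_from_manifest(manifest_text: str, uploaded: set[str]) -> dict[tuple[int, str], str]:
    """Map (voice, language) to a file name, based on a JSON manifest."""
    mapping = {}
    for entry in json.loads(manifest_text):
        filename = entry["filename"]
        if filename not in uploaded:
            # The manifest names a file that is not in the uploaded folder.
            raise ValueError(f"manifest references missing file: {filename}")
        mapping[(entry["voice"], entry["language"])] = filename
    return mapping
```

With this, file names can stay arbitrary; the cost is that users have to write and maintain one extra file per folder.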

matbind commented 4 months ago

Another idea we came up with is to just use

This is a very simple approach, which does not have the advantages mentioned above but in turn is pretty easy to use from a user experience point of view.