lowerquality / gentle

gentle forced aligner
https://lowerquality.com/gentle/
MIT License

Model File Layout #21

Open maxhawkins opened 8 years ago

maxhawkins commented 8 years ago

I wanted to clarify how you see the model files being laid out. I think there's been some confusion and the code as-is doesn't work with the latest model files from lowerquality.com.

Is this the intended file layout?

data
├── nnet_a_gpu_online
│   ├── conf
│   │   ├── ivector_extractor.conf
│   │   ├── ivector_extractor.conf.orig
│   │   ├── mfcc.conf
│   │   ├── mfcc.conf.orig
│   │   ├── online_cmvn.conf
│   │   ├── online_cmvn.conf.orig
│   │   ├── online_nnet2_decoding.conf
│   │   ├── online_nnet2_decoding.conf.orig
│   │   ├── splice.conf
│   │   └── splice.conf.orig
│   ├── final.mdl
│   ├── ivector_extractor
│   │   ├── final.dubm
│   │   ├── final.ie
│   │   ├── final.mat
│   │   └── global_cmvn.stats
│   └── smbr_epoch2.mdl
└── smbr_epoch2.mdl

PROTO_LANGDIR/
├── graphdir
│   ├── phones
│   │   ├── disambig.int
│   │   ├── disambig.txt
│   │   ├── silence.csl
│   │   ├── word_boundary.int
│   │   └── word_boundary.txt
│   ├── phones.txt
│   └── words.txt
├── langdir
│   ├── L.fst
│   ├── L_disambig.fst
│   ├── phones
│   │   ├── disambig.int
│   │   ├── disambig.txt
│   │   ├── silence.csl
│   │   ├── word_boundary.int
│   │   └── word_boundary.txt
│   ├── phones.txt
│   └── words.txt
├── modeldir
│   ├── final.mdl
│   └── tree

I'll update the code to match whatever the correct layout is.
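In the meantime, a quick way to compare a checkout against the dump above is a small path check. This is only a sketch: `missing_files` and `EXPECTED` are hypothetical names, not part of gentle, and the list is a subset of paths copied from the tree dump, which may not be the intended layout.

```python
import os

# Subset of relative paths copied from the tree dump above.
# Assumption: whether this layout is the intended one is still an open question.
EXPECTED = [
    "data/nnet_a_gpu_online/ivector_extractor/final.dubm",
    "data/nnet_a_gpu_online/ivector_extractor/final.ie",
    "data/nnet_a_gpu_online/ivector_extractor/final.mat",
    "data/nnet_a_gpu_online/ivector_extractor/global_cmvn.stats",
    "PROTO_LANGDIR/graphdir/phones.txt",
    "PROTO_LANGDIR/graphdir/words.txt",
    "PROTO_LANGDIR/langdir/L.fst",
    "PROTO_LANGDIR/langdir/L_disambig.fst",
    "PROTO_LANGDIR/modeldir/final.mdl",
    "PROTO_LANGDIR/modeldir/tree",
]


def missing_files(root, expected=EXPECTED):
    """Return the expected relative paths that do not exist under root."""
    missing = []
    for rel_path in expected:
        # Paths are written with "/" and converted to the local separator here.
        full_path = os.path.join(root, *rel_path.split("/"))
        if not os.path.exists(full_path):
            missing.append(rel_path)
    return missing
```

Calling `missing_files(".")` from the repository root would list whichever of these paths are absent; an empty list means the checkout matches this subset of the dump.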

strob commented 8 years ago

Yes, I've been working on making this consistent. I need to take a break now, but in case you want to look at this in the meantime, I've pushed a work-in-progress py2app branch to ebe30a83b3bf4271def2e06ccb5ab79e732e9cb2.

It doesn't completely work yet, but I do think it's close to addressing many of the path issues.

(It treats data/nnet_a_gpu_online as the pristine original, as provided by the lowerquality.com download, and creates derivatives in data/, which is more complicated in the case of running from a .app bundle.)

maxhawkins commented 8 years ago

So the derivatives aren't built in tempfiles anymore? Where does PROTO_LANGDIR live?

strob commented 8 years ago

Changes from your tree dump:

• we no longer have a conf directory in nnet_a_gpu_online (thanks, Max!)
• also, you have an extra final.mdl in nnet_a_gpu_online.
• there can be an optional data/graph/HCLG.fst file, to allow non-user-supplied transcription.

Otherwise the dump is accurate. PROTO_LANGDIR still needs to be in the CWD, and is copied/modified to tempfiles.
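The copy-to-tempfiles step can be sketched roughly as follows. This is an illustration, not the code in the repository: `make_working_langdir` is a hypothetical helper, and the default assumes PROTO_LANGDIR sits in the current working directory as described above.

```python
import os
import shutil
import tempfile


def make_working_langdir(proto_dir="PROTO_LANGDIR"):
    """Copy the prototype tree into a fresh temp dir and return the copy's path.

    Hypothetical helper: the prototype is left untouched, and any
    per-transcript modifications would be made to the returned copy.
    """
    tmp_root = tempfile.mkdtemp(prefix="gentle_lang_")
    name = os.path.basename(os.path.normpath(proto_dir))
    work_dir = os.path.join(tmp_root, name)
    # copytree requires that the destination does not exist yet,
    # which holds because tmp_root was just created and is empty.
    shutil.copytree(proto_dir, work_dir)
    return work_dir
```

The caller is responsible for removing the temporary directory afterwards, for example with `shutil.rmtree(os.path.dirname(work_dir))`.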

maxhawkins commented 8 years ago

What if PROTO_LANGDIR lived inside data? It would be easier to keep track of one folder.

strob commented 8 years ago

That's a good idea. Ideally, we would have a naming scheme that allowed multiple languages to be supported and updates to be applied gracefully.
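As one possible shape for such a scheme, purely as a sketch: assume a hypothetical layout of data/<language>/<version>/, where each version is a directory named with an integer. Neither this layout nor the `latest_model_dir` helper exists in gentle; they only illustrate how several languages and in-place updates could coexist under a single data folder.

```python
import os


def latest_model_dir(data_root, lang):
    """Return data_root/<lang>/<highest version>, or None if nothing is installed.

    Assumes the hypothetical layout data/<lang>/<version>/ with
    integer-named version directories (e.g. data/en/1, data/en/2).
    """
    lang_root = os.path.join(data_root, lang)
    if not os.path.isdir(lang_root):
        return None
    versions = [
        name for name in os.listdir(lang_root)
        if name.isdigit() and os.path.isdir(os.path.join(lang_root, name))
    ]
    if not versions:
        return None
    # Compare numerically so that version 10 sorts after version 2.
    return os.path.join(lang_root, max(versions, key=int))
```

Under this assumption, an update would be installed as a new version directory next to the old one, and the old one could be removed once the new one is complete.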