kaldi-asr / kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.
http://kaldi-asr.org
Other
14.21k stars 5.32k forks source link

Upgrade OpenFST to the latest C++14 version #4565

Open joshuar9 opened 3 years ago

joshuar9 commented 3 years ago

I'm trying to run through the mini_librispeech/s5 example to get a better understanding on how to build models and have encountered an error. I built Kaldi using the MKL math library, and I'm seeing this issue in the relevant log:

# lattice-lmrescore --lm-scale=-1.0 "ark:gunzip -c exp/tri3b/decode_tgsmall_dev_clean_2/lat.1.gz|" "fstproject --project_output=true data/lang_test_tgsmall/G.fst |" ark:- | lattice-lmrescore-const-arpa --lm-scale=1.0 ark:- data/lang_test_tglarge/G.carpa "ark,t:|gzip -c>exp/tri3b/decode_tglarge_dev_clean_2/lat.1.gz"
# Started at Tue Jun 15 00:04:31 UTC 2021
#
lattice-lmrescore --lm-scale=-1.0 'ark:gunzip -c exp/tri3b/decode_tgsmall_dev_clean_2/lat.1.gz|' 'fstproject --project_output=true data/lang_test_tgsmall/G.fst |' ark:-
lattice-lmrescore-const-arpa --lm-scale=1.0 ark:- data/lang_test_tglarge/G.carpa 'ark,t:|gzip -c>exp/tri3b/decode_tglarge_dev_clean_2/lat.1.gz'
FATAL: SetFlags: Bad option: --project_output=true
ERROR: FstHeader::Read: Bad FST header: fstproject --project_output=true data/lang_test_tgsmall/G.fst |. Magic number not matched. Got: 0
ERROR (lattice-lmrescore[5.5.873~1-75ec]:ReadFstKaldi():kaldi-fst-io.cc:35) Reading FST: error reading FST header from 'fstproject --project_output=true data/lang_test_tgsmall/G.fst |'

[ Stack-Trace: ]
lattice-lmrescore(kaldi::MessageLogger::LogMessage() const+0xa18) [0x5463b2]
lattice-lmrescore(kaldi::MessageLogger::LogAndThrow::operator=(kaldi::MessageLogger const&)+0x11) [0x47da63]
lattice-lmrescore(fst::ReadFstKaldi(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)+0x1af) [0x4d012f]
lattice-lmrescore(main+0x20c) [0x478cf3]
/lib64/libc.so.6(__libc_start_main+0xea) [0x7fe66a53b0ba]
lattice-lmrescore(_start+0x2a) [0x478a4a]

WARNING (lattice-lmrescore[5.5.873~1-75ec]:Close():kaldi-io.cc:515) Pipe fstproject --project_output=true data/lang_test_tgsmall/G.fst | had nonzero return status 256
kaldi::KaldiFatalErrorLOG (lattice-lmrescore-const-arpa[5.5.873~1-75ec]:main():lattice-lmrescore-const-arpa.cc:117) Done 0 lattices, failed for 0
# Accounting: time=0 threads=1
# Ended (code 1) at Tue Jun 15 00:04:31 UTC 2021, elapsed time 0 seconds
exp/tri3b/decode_tglarge_dev_clean_2/log/rescorelm.1.log (END)

All of the log files of the form exp/tri3b/decode_tglarge_dev_clean_2/log/rescorelm.*.log seem to have the same error.

It seems to be a bad command,

lattice-lmrescore --lm-scale=-1.0 "ark:gunzip -c exp/tri3b/decode_tgsmall_dev_clean_2/lat.10.gz|" "fstproject --project_output=true data/lang_test_tgsmall/G.fst |" ark:- | lattice-lmrescore-const-arpa --lm-scale=1.0 ark:- data/lang_test_tglarge/G.carpa "ark,t:|gzip -c>exp/tri3b/decode_tglarge_dev_clean_2/lat.10.gz"

when I run the command by itself (after running source path.sh) outside of run.sh, it throws the same error.

I don't understand the format of the command enough to correct it myself, nor how to fix it in the script.

danpovey commented 3 years ago

Do fstproject --help and show the output; and also figure out your OpenFst version; and do $(which fstproject) after sourcing path.sh (. ./path.sh) to see where you are getting it from. Perhaps they have removed the --project_output flag. Or perhaps you installed OpenFst via system commands somehow.

joshuar9 commented 3 years ago

fstproject --help:

(base) [ec2-user@ip-172-31-39-167 s5]$ fstproject --help
Projects a transduction onto its input or output language.

  Usage: fstproject [in.fst [out.fst]]

PROGRAM FLAGS:

  --project_type: type = std::string, default = "input"
  Side to project from, one of: "input", "output"

LIBRARY FLAGS:

Flags from: flags.cc
  --help: type = bool, default = false
  show usage information
  --helpshort: type = bool, default = false
  show brief usage information
  --tmpdir: type = std::string, default = "/tmp"
  temporary directory
  --v: type = int32, default = 0
  verbosity level

Flags from: fst.cc
  --fst_align: type = bool, default = false
  Write FST data aligned where appropriate
  --fst_default_cache_gc: type = bool, default = true
  Enable garbage collection of cache
  --fst_default_cache_gc_limit: type = int64, default = 1048576
  Cache byte size that triggers garbage collection
  --fst_read_mode: type = std::string, default = "read"
  Default file reading mode for mappable files
  --fst_verify_properties: type = bool, default = false
  Verify FST properties queried by TestProperties
  --save_relabel_ipairs: type = std::string, default = ""
  Save input relabel pairs to file
  --save_relabel_opairs: type = std::string, default = ""
  Save output relabel pairs to file

Flags from: symbol-table.cc
  --fst_compat_symbols: type = bool, default = true
  Require symbol tables to match when appropriate
  --fst_field_separator: type = std::string, default = "         "
  Set of characters used as a separator between printed fields

Flags from: util.cc
  --fst_error_fatal: type = bool, default = true
  FST errors are fatal; o.w. return objects flagged as bad: e.g., FSTs: kError property set, FST weights: not a Member()

Flags from: weight.cc
  --fst_weight_parentheses: type = std::string, default = ""
  Characters enclosing the first weight of a printed composite weight (e.g., pair weight, tuple weight and derived classes) to ensure proper I/O of nested composite weights; must have size 0 (none) or 2 (open and close parenthesis)
  --fst_weight_separator: type = std::string, default = ","
  Character separator between printed composite weights; must be a single character

I grepped config.h under/home/ec2-user/kaldi/tools/openfst

(base) [ec2-user@ip-172-31-39-167 openfst]$ grep -i version config.h
/* If available, contains the Python version number currently in use. */
/* Define to the full name and version of this package. */
/* Define to the version of this package. */
#define PACKAGE_VERSION "1.8.0"
/* Version number of package */
#define VERSION "1.8.0"

The version seems to be 1.8.0

And as for which fstproject:

(base) [ec2-user@ip-172-31-39-167 s5]$ which fstproject
~/kaldi/tools/openfst/bin/fstproject

It seems to be using the fstproject program built under kaldi.

I also tried running the problematic command by itself.

(base) [ec2-user@ip-172-31-39-167 s5]$ fstproject --project_output=true data/lang_test_tgsmall/G.fst
FATAL: SetFlags: Bad option: --project_output=true
(base) [ec2-user@ip-172-31-39-167 s5]$ ls -lah data/lang_test_tgsmall/G.fst
-rw-rw-r-- 1 ec2-user ec2-user 30M Jun 14 23:37 data/lang_test_tgsmall/G.fst

It does look like --project_output is no longer a valid option. Which openfst version should I be using?

Perhaps this is the relevant replacement flag

 --project_type: type = std::string, default = "input"
  Side to project from, one of: "input", "output"

This command returned a lot of binary output: fstproject --project_type=output data/lang_test_tgsmall/G.fst I'm guessing that's the update that should be made.

joshuar9 commented 3 years ago

Just for completeness, making that change where it failed also seems to've worked:

(base) [ec2-user@ip-172-31-39-167 s5]$ lattice-lmrescore --lm-scale=-1.0 "ark:gunzip -c exp/tri3b/decode_tgsmall_dev_clean_2/lat.10.gz|" "fstproject --project_type=output data/lang_test_tgsmall/G.fst |" ark:- | lattice-lmrescore-const
-arpa --lm-scale=1.0 ark:- data/lang_test_tglarge/G.carpa "ark,t:|gzip -c>exp/tri3b/decode_tglarge_dev_clean_2/lat.10.gz"
lattice-lmrescore --lm-scale=-1.0 'ark:gunzip -c exp/tri3b/decode_tgsmall_dev_clean_2/lat.10.gz|' 'fstproject --project_type=output data/lang_test_tgsmall/G.fst |' ark:-
lattice-lmrescore-const-arpa --lm-scale=1.0 ark:- data/lang_test_tglarge/G.carpa 'ark,t:|gzip -c>exp/tri3b/decode_tglarge_dev_clean_2/lat.10.gz'
LOG (lattice-lmrescore[5.5.873~1-75ec]:main():lattice-lmrescore.cc:148) Done 104 lattices, failed for 0
LOG (lattice-lmrescore-const-arpa[5.5.873~1-75ec]:main():lattice-lmrescore-const-arpa.cc:117) Done 104 lattices, failed for 0

Is this the expected correct outcome?

danpovey commented 3 years ago

OK, so it looks like the OpenFst guys have changed the command line options. Unfortunately they don't seem to have provided any time to change over, by accepting both the old and new options. @jtrmal @kkm000 we'll have to figure out how to deal with this. OpenFst tools don't seem to have a --version option, but we can still run with --help 2>&1 and grep for the option name, to see which option name it has. I think something like that would be better than requiring a specific version (unless we have already had to make non-compatible C++ changes, in which case we can just change all the scripts).

kkm000 commented 3 years ago

The API had been also changed between 1.7.2 and 1.7.7.

My take is just require 1.8.0+. @danpovey, no, we haven't integrated the new API. So I do not know what to make of @joshuar9 issue--the current codebase would not even compile with 1.8.0. #4533 had been taken, but I don't see any activity happening since.

galv commented 3 years ago

Openfst is a C++17 library as of 1.8.0. Since it's a template library, C++17-isms will probably leak into the kaldi source code, so we should probably migrate to C++17 at the same time.

Honestly, openfst breaking their API seems moronic to me. It is straightforward to make things backwards compatible.

In my view, We can either (1) patch openfst to support the old command-line options. This naively doesn't seem that complicated.

Or (2) we can use some "sed" or codemod changes in the entire kaldi codebase to change the openfst binary invocations. Either way, we would probably need to do some diffs in the bin/ directory of openfst from 1.7.2 to 1.8.1. We could then just document the exact ways that we ran codemod or sed for the sake of power users of kaldi who also need to migrate proprietary code (and I think it is fair to see that most kaldi users are power users at this point).

kkm000 commented 3 years ago

@galv, you can look at and analyse changes here: https://github.com/kkm000/openfst/tree/original. This is a completely unmodified OpenFST source (with only .gitignore and ./_import_release.sh added to the root of the repo). Each release tagged like orig/1.8.0.1. The last digit corresponds to the revision, which were used at a time my import started (see the revision column in the table at the very bottom of the page http://www.openfst.org/twiki/bin/view/FST/FstDownload). Warm up your Git and look at look at the changes to your hart content. It would be interesting to know if any more command-line switches changed.

Kaldi codebase is entirely C++17-compatible. What we're missing is a good CMake build framework to detect compiler features.

I do not know about API changes since 1.7.2. The next Windows port is going to be 1.7.3, but I do not remember why I had to put it aside: either I run into something nasty, or just did not have time for that. @jtrmal wrote a full CMake build framework, and I maintain only minimal VS projects to build the base library (libfst) and the binaries (and, by extension, libfstscript or how it's called). Normally, any OS version can be built from the winport branch with Yenda's CMake files, not only Windows. I currently have absolutely no time to advance the port beyond 1.7.2. There are whole 3 branches that I started, named X-1.7.3. I think I'll move them into wip/ namespace, at the very least to avoid confusing people.

The NEWS file for 1.8.0 shows "* Migration to C++17 (1.8.0)"; I suppose 1.7.9 is C++14. What I propose, given that both I and Yenda are swamped beyond reasonable right now is:

  1. Assuming 1.7.9 can still be configured with ./configure and made with make, let's upgrade to it and stick with it for a while.
  2. If there is only one switch to one program has changed its name without changing the semantics, it's a bash 3-liner compatibility wrapper to rename it. An important thing is to get it on the PATH before the binary, and at the same time to leave the fstproject on the PATH so the wrapper can find it.

What I want to really avoid is maintaining patches to OpenFST. We did that; because of it an upgrade was nearly impossible.

If you want to look into compile problems of Kaldi with FST 1.7.9, i.e. how much API has changed, I'd be very grateful. The change that I know of was in the SymbolTable class, and it was the way the symbol table is enumerated and queried. We do not use symbol tables much, so this should be an easy fix. It's probably too late to try to come up with SNAFU SFINAE magic attempting to support this change; we can just declare 1.7.2 too old to support. It will come back to bite us when ppl use older compilers, etc. Besides, OpenFST started requiring C++14 at some point (1.7.3, I believe), so the newer of (API break, 1.7.3) is our support cutoff point.

This will give us a breathing room. I'm thinking of adding a parallel cmake branch to the repo, with only the CMake build files and no Windows-specific changes. Naturally, I'll fork it at the oldest supported version, and keep tracking forward, along with the Windows version, because the CMake files are identical, it's a very low-maintenance task. This coluld will pay back if we screw up on the Windows branch so that it causes issues on other OSes: we (meaning Yenda and I) had to contort code so that VS compilers did not object to it, and it's not impossible to make it in a way that would make an unrelated compiler unhappy. This is why I'll add such a branch. But this is for later, a month or so at best; right now we need an interim compromise solution.

Sticking with 1.7.2 for a bit more is also an option. It's not broken at all. Personally, I'm having no issues using it.

@joshuar9, it would be very enlightening to understand what exactly prompted you to upgrade as far forward as 1.8.1? You had to modify Kaldi code, I reckon. Or did you compile against 1.7.2, but use the 1.8.1 binaries? Before solving the problem, I want to make sure we really have a problem.


@galv, tangentially, while I have you, if you have time and feel comfortable to review the CUDA decoder part of #4560, I'd really appreciate it, but this one is not urgent at all. And if you know the guts of the decoder and have any ideas what could be causing the three-is-a-charm issue #4556, I'd jump with joy. Possibly literally.

joshuar9 commented 3 years ago

I'm trying to recall all the steps I took to get here myself. lol.

I started by following a Vosk docker definition file https://github.com/alphacep/vosk-server/blob/master/docker/Dockerfile.kaldi-vosk-server

Then I did git clone -b lookahead-1.8.0 --single-branch https://github.com/alphacep/kaldi in my home directory.

I repeated this just awhile ago, and when I look at kaldi/tools/Makefile, I see:

# Note: OpenFst requires a relatively recent C++ compiler with C++11 support,
# e.g. g++ >= 4.7, Apple clang >= 5.0 or LLVM clang >= 3.3.
OPENFST_VERSION ?= 1.8.0
CUB_VERSION ?= 1.8.0

So I guess this branch has Openfst at 1.8.0. After performing make in the kaldi/tools directory, I get a folder openfst-1.8.0.

Ohhh. I just noticed. This is for a fork of the kaldi project alphacep/kaldi vs kaldi-asr/kaldi. And sure enough https://github.com/alphacep/kaldi/blob/lookahead-1.8.0/tools/Makefile Has the openfst version of 1.8.0.

kkm000 commented 3 years ago

@joshuar9, aha, got it. @nshmyrev, if you've already ported Kaldi to FST 1.8, we'll likely better off just stealing your changes back and living happily ever after... Anything to be aware of, specifically?

stale[bot] commented 3 years ago

This issue has been automatically marked as stale by a bot solely because it has not had recent activity. Please add any comment (simply 'ping' is enough) to prevent the issue from being closed for 60 more days if you believe it should be kept open.