fgnt / sms_wsj

SMS-WSJ: Spatialized Multi-Speaker Wall Street Journal database for multi-channel source separation and recognition
MIT License

Argument list too long #31

Open jensen201113 opened 4 months ago

jensen201113 commented 4 months ago

Hi, when I run make WSJ_DIR=/hy-tmp/wsj SMS_WSJ_DIR=/hy-tmp/sms_wsj, I get an "Argument list too long" error. I looked into the source code and did some debugging. It seems that dirty.txt is too large for sh.perl, which is called by normalize_transcription in create_json.py. The maximum length for sh.perl is 128K, but dirty.txt is over 10M long. How can I fix this? By the way, the Linux kernel version is 5.4.0-146-generic.

Here is the whole terminal output:

////////////////// Terminal output //////////////////
creating /hy-tmp/sms_wsj/wsj_8k_zeromean.json
python -m sms_wsj.database.wsj.create_json \
    with json_path=/hy-tmp/sms_wsj/wsj_8k_zeromean.json database_dir=/hy-tmp/sms_wsj/wsj_8k_zeromean as_wav=True
WARNING - Create wsj json - No observers have been added to this run
INFO - Create wsj json - Running command 'create_database'
INFO - Create wsj json - Started
INFO - sh.command - <Command '/usr/bin/cat /tmp/tmprgsph81i/dirty.txt', pid 1008>: process started
ERROR - Create wsj json - Failed after 0:00:11!
Traceback (most recent calls WITHOUT Sacred internals):
  File "/root/sms_wsj/sms_wsj/database/wsj/create_json.py", line 293, in create_database
    transcriptions = get_transcriptions(database_dir, database_dir)
  File "/root/sms_wsj/sms_wsj/database/wsj/create_json.py", line 170, in get_transcriptions
    data_dict["clean word"] = normalize_transcription(word, wsj_root)
  File "/root/sms_wsj/sms_wsj/database/wsj/create_json.py", line 192, in normalize_transcription
    result = sh.perl(
  File "/root/.local/lib/python3.8/site-packages/sh.py", line 1508, in __call__
    rc = self.__class__.RunningCommandCls(cmd, call_args, stdin, stdout, stderr)
  File "/root/.local/lib/python3.8/site-packages/sh.py", line 720, in __init__
    self.process = OProc(
  File "/root/.local/lib/python3.8/site-packages/sh.py", line 2157, in __init__
    raise ForkException(fork_exc)
sh.ForkException:

Original exception:

Traceback (most recent call last):
  File "/root/.local/lib/python3.8/site-packages/sh.py", line 2110, in __init__
    os.execv(bytes_cmd[0], bytes_cmd)
OSError: [Errno 7] Argument list too long

boeddeker commented 4 months ago

Hi,

thanks for reporting.

sh 2.0.0 introduced a breaking change: the return type of commands changed.
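To illustrate why a changed return type can end in "Argument list too long", here is a minimal sketch. It uses only the standard library `subprocess` module, not `sh`, and it is not the code from create_json.py. The assumption is that, with the new return type, the output of the inner cat command reaches perl as one very large command-line argument instead of being piped in. The helper names `run_with_text_as_argument` and `run_with_text_on_stdin` are hypothetical, and the sketch assumes a Linux-like system where an oversized argument list fails with E2BIG (errno 7).

```python
import errno
import subprocess
import sys

# Roughly 10 MB of text, comparable to the dirty.txt size reported above.
big_text = "A" * (10 * 1024 * 1024)

# Child process that counts the characters it receives on stdin.
count_stdin = [sys.executable, "-c", "import sys; print(len(sys.stdin.read()))"]


def run_with_text_as_argument(text):
    """Pass the text as a command-line argument; return the errno on failure."""
    try:
        subprocess.run([sys.executable, "-c", "pass", text], check=True)
    except OSError as exc:
        # The kernel rejects the exec call before the child ever starts.
        return exc.errno
    return None


def run_with_text_on_stdin(text):
    """Feed the text through stdin, which is not subject to the argv limit."""
    result = subprocess.run(
        count_stdin, input=text, capture_output=True, text=True, check=True
    )
    return int(result.stdout)


arg_errno = run_with_text_as_argument(big_text)
stdin_count = run_with_text_on_stdin(big_text)
print("as argument:", errno.errorcode.get(arg_errno, arg_errno))
print("via stdin, characters received:", stdin_count)
```

The same data that is rejected as an argument passes through stdin without trouble, which is why any fix would need to make sure the file content is piped to the command rather than placed on its command line.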

Two solutions:

jensen201113 commented 4 months ago

@boeddeker problem solved, thank you