peter-makarov / nn_lms

Neural language models

LM training #2

Open simon-clematide opened 5 years ago

simon-clematide commented 5 years ago

export LNG=de && python3 lm/preprocessing.py with configs/${LNG}_word_pp.json >configs/${LNG}_word_pp.json.output

export LNG=es && CUDA_VISIBLE_DEVICES=3 nohup python3 lm/trainer.py with configs/${LNG}_word_lm.json &> logs/${LNG}_word_lm.log &

WORD LMs

ON s3it

CHAR LMs

export LNG=XXXX && python3 lm/preprocessing.py with configs/${LNG}_char_pp.json >configs/${LNG}_char_pp.json.output

export LNG=XXXX && CUDA_VISIBLE_DEVICES=666 nohup python3 lm/trainer.py with configs/${LNG}_char_lm.json &> logs/${LNG}_char_lm.log &

- en: gpu2

export LNG=en && CUDA_VISIBLE_DEVICES=2 nohup python3 lm/trainer.py with configs/${LNG}_char_lm.json &> logs/${LNG}_char_lm.log &
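The training commands above differ only in language, model type and GPU (XXXX and 666 appear to be placeholders), so they could be wrapped in a small bash helper. This is only a sketch: the function name `train_lm` and the `DRY_RUN` switch are hypothetical and not part of the repository; only the command layout is taken from the notes above.

```shell
# Hypothetical helper (not part of the repo): start training for one
# language / model type / GPU, following the command layout used above.
train_lm() {
    local lng="$1" mtype="$2" gpu="$3"
    local config="configs/${lng}_${mtype}_lm.json"
    local log="logs/${lng}_${mtype}_lm.log"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Assumed convenience switch: print the command instead of running it.
        echo "CUDA_VISIBLE_DEVICES=${gpu} nohup python3 lm/trainer.py with ${config} &> ${log} &"
    else
        CUDA_VISIBLE_DEVICES="${gpu}" nohup python3 lm/trainer.py with "${config}" &> "${log}" &
    fi
}

# Show what would be started for the English char LM on GPU 2.
DRY_RUN=1 train_lm en char 2
```

Without `DRY_RUN=1` the same call would launch the job in the background, as in the commands above.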

deleting the corpus output directory of a language

rm -r wplmdata-preprocessed/sl/

simon-clematide commented 5 years ago

Overview of models

(https://gitlab.ifi.uzh.ch/siclemat/neural-wp-language-models-for-hististorical-normaliization)

| lang | mtype | status | best valid ppl | training start | training end | duration |
|------|-------|--------|----------------|----------------|--------------|----------|
| de | word | done | 63.68 | 2019-05-07 02:11:03 | 2019-05-14 14:42:46 | 7 days 12 hrs 31 mins 43 secs |
| de | char | done | 2.71 | 2019-05-17 19:16:37 | 2019-05-24 17:00:37 | 6 days 21 hrs 44 mins |
| en | word | | | | | |
| en | char | | | | | |
| es | word | done | 49.97 | 2019-05-06 18:27:23 | 2019-05-14 14:37:03 | 7 days 20 hrs 9 mins 40 secs |
| es | char | done | 2.54 | 2019-05-17 01:57:45 | 2019-05-24 21:30:44 | 7 days 19 hrs 32 mins 59 secs |
| is | word | done | 105.95 | 2019-05-10 14:00:50 | 2019-05-14 18:06:47 | 4 days 4 hrs 5 mins 57 secs |
| is | char | done | 2.97 | 2019-05-14 15:46:19 | 2019-05-15 20:36:02 | 1 day 4 hrs 49 mins 43 secs |
| pt | word | done | 61.69 | 2019-05-06 18:09:35 | 2019-05-14 14:42:20 | 7 days 20 hrs 32 mins 45 secs |
| pt | char | done | 2.65 | 2019-05-15 17:37:54 | 2019-05-23 07:27:57 | 7 days 13 hrs 50 mins 3 secs |
| sl | word | done | 83.17 | 2019-05-06 17:58:21 | 2019-05-14 14:42:33 | 7 days 20 hrs 44 mins 12 secs |
| sl | char | done | 2.87 | 2019-05-14 17:32:12 | 2019-05-19 22:13:10 | 5 days 4 hrs 40 mins 58 secs |
| sv | word | done | 12.44 | 2019-05-16 02:00:53 | 2019-05-24 21:10:12 | 8 days 19 hrs 9 mins 19 secs |
| sv | char | on rattle | | Fri 24 May 23:55:03 | | |

(venv) siclemat@rattle:~/nnlm-2019/nn_lms/models.d/pt_char_lm$ grep -oP "valid ppl .{8}" < loss.txt|sort -rn
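Because every line extracted by the grep above starts with the text `valid ppl`, `sort -rn` has no leading number to compare, so the ordering it produces is probably not a true numeric one. A possible alternative that sorts on the number itself is sketched below. It assumes that each evaluation line in loss.txt contains `valid ppl` followed by the value, as the grep pattern suggests; the function name `best_valid_ppl` and the sample log lines are made up for illustration.

```shell
# Print the lowest validation perplexity found in a training log.
# Assumption: lines look like "... valid ppl   2.66"; the real loss.txt
# layout may differ.
best_valid_ppl() {
    grep -oE 'valid ppl +[0-9]+(\.[0-9]+)?' "$1" \
        | awk '{ print $3 }' \
        | LC_ALL=C sort -n \
        | head -n 1
}

# Example on a made-up log (illustrative values, not real results).
log="$(mktemp)"
printf '%s\n' \
    '| epoch 1 | valid loss 1.10 | valid ppl     3.00' \
    '| epoch 2 | valid loss 0.98 | valid ppl     2.66' \
    '| epoch 3 | valid loss 1.01 | valid ppl     2.75' > "$log"
best_valid_ppl "$log"
rm -f "$log"
```

`LC_ALL=C` is set so that the decimal point is read the same way regardless of the locale on the machine.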

(http://www.grun1.com/utils/dateTimeDiff.cfm)
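The durations can also be computed locally instead of with the web tool. The sketch below assumes GNU `date` (for the `-d` option) and interprets both timestamps as UTC so that a daylight-saving change cannot shift the result; the function name `duration` is hypothetical.

```shell
# Print the elapsed time between two "YYYY-MM-DD HH:MM:SS" timestamps.
# Assumes GNU date; both timestamps are interpreted as UTC.
duration() {
    local start end secs
    start=$(date -u -d "$1" +%s)
    end=$(date -u -d "$2" +%s)
    secs=$(( end - start ))
    printf '%d days %d hrs %d mins %d secs\n' \
        $(( secs / 86400 )) $(( secs % 86400 / 3600 )) \
        $(( secs % 3600 / 60 )) $(( secs % 60 ))
}

# Training start and end of the "is char" model from the table above.
duration "2019-05-14 15:46:19" "2019-05-15 20:36:02"
# prints: 1 days 4 hrs 49 mins 43 secs
```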

simon-clematide commented 5 years ago

Start slurm job on s3it

#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --time=7-0:00:00
#SBATCH --gres gpu:Tesla-K80:1 --mem=10000
#SBATCH --cpus-per-task=2
#SBATCH --mail-type=ALL        # notifications for job done & fail
#SBATCH --mail-user=simon.clematide@uzh.ch # send-to address
#SBATCH -p vesta
#SBATCH --qos vesta
#SBATCH -A uzh
conda activate p36lm2019
python3 lm/trainer.py with configs/is_word_lm.json &> logs/is_word_lm.log
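If several models are to be trained on s3it, the job script above could be generated per language and model type rather than edited by hand. The following is a sketch only: `make_job` and the `jobs/` directory are hypothetical names, and the `--job-name` line is an addition that is not in the original script. All other settings are copied from the script above.

```shell
# Hypothetical generator: write a Slurm job script for one model and
# print its path. Settings are copied from the script above.
make_job() {
    local name="${1}_${2}_lm"
    local job="jobs/${name}.sh"
    mkdir -p jobs
    cat > "$job" <<EOF
#!/bin/bash
#SBATCH --job-name=${name}
#SBATCH --ntasks=1
#SBATCH --time=7-0:00:00
#SBATCH --gres gpu:Tesla-K80:1 --mem=10000
#SBATCH --cpus-per-task=2
#SBATCH --mail-type=ALL
#SBATCH --mail-user=simon.clematide@uzh.ch
#SBATCH -p vesta
#SBATCH --qos vesta
#SBATCH -A uzh
conda activate p36lm2019
python3 lm/trainer.py with configs/${name}.json &> logs/${name}.log
EOF
    echo "$job"
}

# Possible usage on the cluster (not run here):
#   sbatch "$(make_job is word)"
```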