dylkot / cNMF

Code and example data for running Consensus Non-negative Matrix Factorization on single-cell RNA-Seq data
MIT License
243 stars 57 forks

Hello. I have a question about errors related to using GNU parallel. #63

Closed jaewoomo closed 1 month ago

jaewoomo commented 1 year ago

prepared

    cmd = '/home/user/anaconda3/envs/jupyter/bin/cnmf prepare --output-dir /home/jupyter/ahwodn/pp/recent/Tumor_analysis/gnu/ --name gnu_malignant_cNMF -c /home/jupyter/ahwodn/pp/recent/Tumor_analysis/malignant_count.tsv -k 5 6 7 8 9 10 --n-iter 200 --total-workers 10 --seed 14 --numgenes 3000 --beta-loss frobenius'
    print('Command line - prepare step: %s' % cmd)
    !{cmd}

Using GNU parallel

    numworkers = 10
    factorize_cmd = '/home/user/anaconda3/envs/jupyter/bin/cnmf factorize --output-dir /home/jupyter/ahwodn/pp/recent/Tumor_analysis/gnu/ --name gnu_malignant_cNMF --worker-index {} ::: 0 1 2 3 4 5 6 7 8 9'
    print('Factorize command to simultaneously run factorization over %d cores using GNU parallel:\n%s' % (numworkers, factorize_cmd))
    !{factorize_cmd}

When I ran the 'Using GNU parallel' step, this error occurred. Can you tell me the solution?

Error

    Factorize command to simultaneously run factorization over 4 cores using GNU parallel:
    /home/user/anaconda3/envs/jupyter/bin/cnmf factorize --output-dir /home/jupyter/ahwodn/pp/recent/Tumor_analysis/gnu/ --name gnu_malignant_cNMF --worker-index {} ::: 0 1 2 3 4 5 6 7 8 9
    usage: cnmf [-h] [--name [NAME]] [--output-dir [OUTPUT_DIR]] [-c COUNTS]
                [-k COMPONENTS [COMPONENTS ...]] [-n N_ITER]
                [--total-workers TOTAL_WORKERS] [--seed SEED]
                [--genes-file GENES_FILE] [--numgenes NUMGENES] [--tpm TPM]
                [--beta-loss {frobenius,kullback-leibler,itakura-saito}]
                [--init {random,nndsvd}] [--densify]
                [--worker-index WORKER_INDEX]
                [--local-density-threshold LOCAL_DENSITY_THRESHOLD]
                [--local-neighborhood-size LOCAL_NEIGHBORHOOD_SIZE]
                [--show-clustering]
                {prepare,factorize,combine,consensus,k_selection_plot}
    cnmf: error: argument --worker-index: invalid int value: '{}'

dylkot commented 1 month ago

Hi @jaewoomo, hmm, I'm not totally sure. Have you installed cnmf using pip and parallel using conda? You also need to include --total-workers 4 as an argument (sorry, this is currently missing from the tutorial). If so, a command like the one below should work:

    parallel cnmf factorize --output-dir example_PBMC/cNMF --name pbmc_cNMF --total-workers 4 --worker-index {} ::: 0 1 2 3
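
For anyone running this from a notebook, here is a minimal sketch of building that command in Python. It is not part of cNMF: `build_factorize_cmd` and `parallel_available` are hypothetical helper names made up for illustration. One likely reading of the error above is that the command string in the question starts with `cnmf` rather than `parallel`, so nothing replaced `{}` with a worker index before cnmf parsed it. The sketch puts `parallel` first and keeps `--total-workers` equal to the number of indices listed after `:::`.

```python
import shlex
import shutil


def parallel_available():
    """Return True if a `parallel` executable is found on PATH."""
    return shutil.which("parallel") is not None


def build_factorize_cmd(cnmf_bin, output_dir, name, total_workers):
    """Build a GNU parallel command that runs one `cnmf factorize` job per worker.

    Hypothetical helper, not part of cNMF. `{}` is the GNU parallel
    replacement string; the values after `:::` are substituted into it,
    one job per value.
    """
    worker_indices = [str(i) for i in range(total_workers)]
    parts = [
        "parallel",  # without this prefix, cnmf receives the literal '{}'
        shlex.quote(cnmf_bin),
        "factorize",
        "--output-dir", shlex.quote(output_dir),
        "--name", shlex.quote(name),
        "--total-workers", str(total_workers),
        "--worker-index", "{}",
        ":::",
    ] + worker_indices
    return " ".join(parts)


if __name__ == "__main__":
    cmd = build_factorize_cmd("cnmf", "example_PBMC/cNMF", "pbmc_cNMF", 4)
    print(cmd)
    if not parallel_available():
        print("GNU parallel was not found on PATH; install it before running the command.")
```

In a Jupyter cell the resulting string could then be run with `!{cmd}`, as in the original question. The value passed as `total_workers` here should presumably match the `--total-workers` value used in the prepare step.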