HOW to continue cnmf-factorize?

CoolCurl commented 1 year ago

If my process is killed before it has completed all iterations, but some of the output files have already been saved, can I add some code to cnmf.py to resume the work?

Raoufsk commented 5 months ago

I am also interested in this, @CoolCurl. Did you find a way?

dylkot commented 4 months ago

Hi, sorry, this isn't easy to address in the current implementation, but it is something I will think about adding in the future. This is what I have done in the past: I gather up the indices of the workers (nodes) that didn't complete by checking whether their output files exist, then I resubmit each of them separately, one per node.

import os
import numpy as np
import pandas as pd

def worker_filter(iterable, worker_index, total_workers):
    return (p for i,p in enumerate(iterable) if (i-worker_index)%total_workers==0)

def load_df_from_npz(filename):
    with np.load(filename, allow_pickle=True) as f:
        obj = pd.DataFrame(**f)
    return obj

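# Identify the indices of the workers with missing jobs and store them in missing.
# Assumes cnmf_obj, total_workers, name, cnmfdir, cnmfout and jname are already defined.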
missing = []
run_params = load_df_from_npz(cnmf_obj.paths['nmf_replicate_parameters'])
for worker_i in range(total_workers):
    jobs_for_this_worker = worker_filter(range(len(run_params)), worker_i, total_workers)
    for idx in jobs_for_this_worker:
        p = run_params.iloc[idx, :]
        outfn = cnmf_obj.paths['iter_spectra'] % (p['n_components'], p['iter'])
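        if not os.path.exists(outfn):
            print(worker_i, outfn)
            # Record each worker only once, even if several of its outputs are missing,
            # so the same worker index is not submitted twice below
            if worker_i not in missing:
                missing.append(worker_i)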

# Submit the individual missing jobs to a single node each.
basecmd = "export OMP_NUM_THREADS=6; cnmf factorize --name {name} --output-dir {outdir} --total-workers {tw} --worker-index {i}"
q = 'medium'

for i in missing:
    cmd = basecmd.format(name=name, outdir=cnmfdir, i=i, tw=total_workers)
    e = os.path.join(cnmfout, '{j}.{i}.err.txt').format(i=i, j=jname)
    o = os.path.join(cnmfout, '{j}.{i}.out.txt').format(i=i, j=jname)
    bsub_cmd = 'bsub -q {q} -J {j} -o {o} -e {e} "{cmd}"'.format(q=q, j=jname, e=e, o=o,  cmd=cmd)
    print(bsub_cmd)
    # Submit the job in jupyter notebook using !
    !{bsub_cmd}
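
The missing-worker check above can be tried out without a real cNMF run. The following is only a minimal sketch under stated assumptions: the file names are hypothetical placeholders (not the actual cNMF output names, which come from cnmf_obj.paths['iter_spectra']), and find_incomplete_workers is an illustrative helper that is not part of cNMF. It reuses the same round-robin worker_filter logic and returns each incomplete worker index once.

```python
import os
import tempfile


def worker_filter(iterable, worker_index, total_workers):
    """Round-robin split: worker k gets items k, k + total_workers, ..."""
    return (p for i, p in enumerate(iterable)
            if (i - worker_index) % total_workers == 0)


def find_incomplete_workers(expected_files, total_workers):
    """Hypothetical helper: sorted worker indices that own at least one
    expected output file that does not exist on disk."""
    incomplete = set()
    for worker_i in range(total_workers):
        for path in worker_filter(expected_files, worker_i, total_workers):
            if not os.path.exists(path):
                incomplete.add(worker_i)
                break  # one missing file is enough to flag this worker
    return sorted(incomplete)


# Toy demonstration with made-up file names (assumption, not cNMF's naming).
with tempfile.TemporaryDirectory() as tmpdir:
    expected = [os.path.join(tmpdir, f"spectra.k_5.iter_{i}.npz")
                for i in range(10)]
    for i, path in enumerate(expected):
        if i not in (4, 7, 9):  # pretend jobs 4, 7 and 9 were killed
            open(path, "w").close()
    incomplete_workers = find_incomplete_workers(expected, total_workers=3)

print(incomplete_workers)  # worker indices to resubmit
```

With three workers, jobs 4 and 7 both belong to worker 1 and job 9 belongs to worker 0, so only those two worker indices would need to be resubmitted. Using a set avoids submitting the same worker more than once.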

dylkot / cNMF

HOW to continue cnmf-factorize? #61