philippdre / omniCLIP

omniCLIP is a CLIP-Seq peak caller
GNU General Public License v3.0

several bugs in run_omniCLIP #27

Open vagarwal87 opened 3 years ago


A few bugs for people to be aware of. The Viterbi step can take days, so it is frustrating when the program then terminates unsuccessfully:

1) Nothing in the documentation says that the db file name must end in ".db". Mine did not, so my GeneAnnotation was never set up:

if args.gene_anno_file.split('.')[-1] == 'db':
    GeneAnnotation = gffutils.FeatureDB(args.gene_anno_file, keep_order=True)

I ended up just commenting out the "if" statement, and it worked fine. Alternatively, the README could state that the file name must end in .db.
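For illustration, one way to avoid depending on the file extension is to look at the file's content instead. gffutils stores its annotation database as an SQLite file, and every SQLite 3 file starts with a fixed 16-byte header. The sketch below is not omniCLIP code: `looks_like_sqlite_db` is a hypothetical helper name, and it uses only the standard library.

```python
SQLITE_MAGIC = b"SQLite format 3\x00"  # fixed 16-byte header of an SQLite 3 file


def looks_like_sqlite_db(path):
    """Return True if the file at `path` starts with the SQLite 3 header.

    Hypothetical helper, not part of omniCLIP or gffutils.
    """
    try:
        with open(path, "rb") as handle:
            return handle.read(len(SQLITE_MAGIC)) == SQLITE_MAGIC
    except OSError:
        # Missing or unreadable file: treat it as "not a database"
        return False
```

A check along these lines could stand in for the `split('.')[-1] == 'db'` test, so that a database with any file name is still passed to gffutils.FeatureDB.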

2) I added the following lines:

if not os.path.exists(EmissionParameters['out_dir']):  # added
    os.makedirs(EmissionParameters['out_dir'])         # added

OutFile = os.path.join(EmissionParameters['out_dir'],
                       EmissionParameters['out_file_base'] + '.txt')

because the program terminates with an error if the output directory doesn't already exist.
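The same fix can be written more compactly on Python 3.2 or later, where os.makedirs accepts exist_ok=True. A minimal sketch, in which `prepare_out_file` is a hypothetical helper name rather than an omniCLIP function:

```python
import os


def prepare_out_file(out_dir, out_file_base):
    """Create `out_dir` if needed and return the path of the .txt output file.

    Hypothetical helper, not part of omniCLIP.
    """
    # exist_ok=True makes this a no-op when the directory already exists
    os.makedirs(out_dir, exist_ok=True)
    return os.path.join(out_dir, out_file_base + ".txt")
```

Running a check like this at startup, before the long Viterbi step, would surface a bad output path right away instead of after days of computation.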

3) --nb-cores > 1 doesn't work for me, even though my hardware has 8 cores, so for now I'm just using 1. Possibly this is something odd in how the multiprocessing module interacts with my system, though.
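One thing that may be worth ruling out here (an assumption, not a confirmed cause) is that the process is allowed to use fewer cores than the hardware has, for example under a cluster scheduler or taskset. A small standard-library sketch, with `usable_cores` as a hypothetical helper name:

```python
import os


def usable_cores():
    """Return the number of CPU cores this process may run on.

    Hypothetical diagnostic helper, not part of omniCLIP.
    """
    if hasattr(os, "sched_getaffinity"):
        # Available on Linux: reflects the CPU affinity mask of this process
        return len(os.sched_getaffinity(0))
    # Fallback: total logical cores, or 1 if that cannot be determined
    return os.cpu_count() or 1
```

If this reports fewer cores than expected, the limit comes from the environment rather than from omniCLIP. If it reports all 8, the problem lies elsewhere.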

4) In general, it would be helpful to state whether this package is intended for all CLIP variants. I thought so initially, but one of the open comments suggests that you're still evaluating T->C mutations with PAR-CLIP. Is it okay to use this with iCLIP/eCLIP/HITS-CLIP, or does it always use mutations to evaluate significance?