hringbauer / hapROH

Software to call ROHs
GNU General Public License v3.0
15 stars 3 forks source link

Different result in v0.1 and v0.3? #2

Closed ryhui closed 3 years ago

ryhui commented 3 years ago

Hi Harald,

I notice the current version of hapROH gives very different result compared to the last version I used (v0.1a4) on the same data. I checked that the hap.csv and map.csv files are identical, but posterior0.csv and roh.csv are not. v0.3 is somehow tagging many more ROH than v0.1 - I got > 3000cM ROH tracks for everybody, which is almost the entire genome. This happens in both haploid and diploid mode.

It could also be user error on my side. I followed the parameters from the notebook:

hapsb_ind(iid=ind, chs=range(1, 23),
                  path_targets=infile, 
                  h5_path1000g='roh/all1240/chr',
                  meta_path_ref='roh/all1240/meta_df_all.csv',
                  folder_out='./{}/'.format(outdir), prefix_out='',
                  e_model='haploid', p_model='EigenstratUnpacked',
                  post_model='Standard', processes=nthreads, delete=False, output=True, save=True,
                  save_fp=False, n_ref=2504, exclude_pops=[], readcounts=True, random_allele=True,
                  roh_in=1, roh_out=20, roh_jump=300, e_rate=0.01, e_rate_ref=0.0,
                  cutoff_post=0.999, max_gap=0, roh_min_l=0.01,
                  logfile=False, combine=True, file_result='_roh_full.csv')

Should I change anything? Also happy to share my input and output files if needed.

Best, Ruoyun

hringbauer commented 3 years ago

I believe your readcounts=True is problematic. You don't actually use read count data (available via hdf5 target format), so try setting that parameter to False. I did some updates to the readcount mode that make it behave correctly now, which in your case produce that error.

Otherwise the parameters look good.

Side note: v0.1a4 uses a faster HMM (a rescaled version not doing the log transform) and posterior0 is now saved not log-transformed. Every new run should do that automatically.

ryhui commented 3 years ago

Thanks, that solved the problem!

Ruoyun