kzkedzierska / sonics

SONiCS - Stutter mONte Carlo Simulation
MIT License

help #15

Closed: zhangjinpengGithub closed this issue 3 years ago

zhangjinpengGithub commented 3 years ago

This is the output from my SONiCS installation. I don't know why I am getting this error, and I hope you can help.

2021-07-20 03:09:00,041 INFO Initiating simulation for sample Block
Traceback (most recent call last):
  File "sonics", line 761, in <module>
    main()
  File "sonics", line 757, in main
    sonics_run_options=sonics_run_options
  File "sonics", line 243, in run_sonics
    sonics_run_options
  File "sonics", line 431, in process_one_genotype
    options
  File "sonics.pyx", line 140, in sonics.monte_carlo
    min_sim = results_pd['lnL'].apply(lambda x: x[x > -999999].count()).sort_values().iloc[0]
TypeError: sort_values() missing 1 required positional argument: 'by'
The installation did not work as expected. Please check the error messages above to resolve installation issues. Please contact the author if you need further help.

aakrosh commented 3 years ago

Could you please tell us the version of python you are using?
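For anyone answering this kind of question, here is a minimal sketch of one way to collect the interpreter and library versions in one go. The helper name `report_versions` and the default package list are assumptions made for illustration; they are not part of SONiCS.

```python
import importlib
import platform


def report_versions(packages=("numpy", "pandas", "scipy")):
    """Return a dict mapping 'python' and each package name to a version string.

    The package list is an assumption for illustration; adjust it to
    whatever the tool being debugged actually depends on.
    """
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Package is missing from this environment.
            versions[name] = "not installed"
        else:
            # Most scientific packages expose __version__; fall back if not.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}={version}")
```

Pasting the printed lines into the issue gives the maintainers the Python version together with the library versions in the same environment.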

zhangjinpengGithub commented 3 years ago

> Could you please tell us the version of python you are using?

Thanks for your reply. The installation worked when I tried the following software versions: numpy=1.17.0, pandas=0.24.1, scipy=1.6.1, python=3.7.

I also have another question, and I wonder whether your software can handle it. In a SAM file, many reads map to an STR locus, but because of "stutter", the reads can carry the motif with different numbers of repeats. For example, suppose that in one mapping (AT)10 occurs 8 times, (AT)11 occurs 6 times, (AT)12 occurs twice, and (AT)13 occurs once. Would running your program as python sonics "10|8;11|6;12|2;13|1" predict the STR genotype more accurately? Thanks again for your answer!
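The read-count example above can be turned into that argument string programmatically. This is only a sketch: the `allele|count;allele|count` layout is inferred from the string quoted in this comment, and the function name `format_sonics_input` is hypothetical, so check the SONiCS README before relying on it.

```python
from collections import Counter


def format_sonics_input(repeat_counts):
    """Build an 'allele|reads;allele|reads' string from per-read repeat counts.

    `repeat_counts` is an iterable with one entry per read, giving the
    number of motif repeats observed in that read. The output layout is
    an assumption based on the example string in this thread.
    """
    support = Counter(repeat_counts)  # allele length -> number of reads
    return ";".join(
        f"{allele}|{reads}" for allele, reads in sorted(support.items())
    )


# The example from the comment: (AT)10 x8, (AT)11 x6, (AT)12 x2, (AT)13 x1.
observed = [10] * 8 + [11] * 6 + [12] * 2 + [13] * 1
genotype_string = format_sonics_input(observed)
print(genotype_string)  # 10|8;11|6;12|2;13|1
```

Extracting the per-read repeat counts from the SAM file is a separate step that is not shown here.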

aakrosh commented 3 years ago

I am glad the installation worked. SONiCS is designed for dense forward simulations of PCR and capture experimental conditions, and in those conditions, it should provide genotypes that are more accurate. Details can be found in the manuscript (https://academic.oup.com/bioinformatics/article/34/23/4115/5040316). So, if the SAM file in your specific case is from such an experiment, then we expect SONiCS to improve the genotyping.

zhangjinpengGithub commented 3 years ago

My experimental data do not come from target capture but from whole-genome sequencing or RAD-seq. I would like to know whether SONiCS can improve genotyping under those conditions. Thanks again!

aakrosh commented 3 years ago

SONiCS is not the right tool for WGS and RAD-Seq datasets. I would suggest looking into HipSTR, GangSTR or RepeatSeq. Thanks.