junseonghwan / PhylExAnalysis


Segmentation fault when running simulation data #5

Closed. Eric0627 closed this issue 1 year ago.

Eric0627 commented 1 year ago

Hello,

I simulated bulk.txt and sc.txt and used the provided script to generate the hyperparameters. However, when I run RunInference.py with the default parameters, it fails with the following error: ./RunInference.sh: line 14: 22861 Segmentation fault ./run -c ${CONFIG_FILE}. My simulated input files are as follows:

bulk.txt

s0  chr22   25570551    T   C   6,6,6,6,6   184,184,184,184,184 2,2,2,2,2   2,2,2,2,2
s1  chr22   25600316    A   G   4,4,4,4,4   202,202,202,202,202 2,2,2,2,2   2,2,2,2,2
s2  chr22   25610924    G   C   13,13,13,13,13  219,219,219,219,219 2,2,2,2,2   2,2,2,2,2
s3  chr22   25640099    A   G   6,6,6,6,6   186,186,186,186,186 2,2,2,2,2   2,2,2,2,2
s4  chr22   25710536    T   G   11,11,11,11,11  188,188,188,188,188 2,2,2,2,2   2,2,2,2,2
s5  chr22   25730696    T   G   6,6,6,6,6   227,227,227,227,227 2,2,2,2,2   2,2,2,2,2
...
s251    chr22   33300040    G   C   10,10,10,10,10  191,191,191,191,191 2,2,2,2,2   2,2,2,2,2
s252    chr22   33360892    C   T   164,164,164,164,164 313,313,313,313,313 2,2,2,2,2   4,4,4,4,4
s253    chr22   33380380    G   A   227,227,227,227,227 339,339,339,339,339 2,2,2,2,2   6,6,6,6,6
s254    chr22   33430573    G   T   160,160,160,160,160 273,273,273,273,273 2,2,2,2,2   4,4,4,4,4
s255    chr22   33480850    T   C   130,130,130,130,130 263,263,263,263,263 2,2,2,2,2   3,3,3,3,3
s256    chr22   33490392    G   C   140,140,140,140,140 280,280,280,280,280 2,2,2,2,2   3,3,3,3,3
...

sc.txt

ID  Cell    a   d   loc
s0  c0  18  19  chr22:25570551
s0  c1  82  83  chr22:25570551
s0  c2  78  82  chr22:25570551
s1  c0  17  17  chr22:25600316
s1  c1  103 103 chr22:25600316
s1  c2  78  82  chr22:25600316
...
s251    c0  14  19  chr22:33300040
s251    c1  101 101 chr22:33300040
s251    c2  66  71  chr22:33300040
s252    c0  21  26  chr22:33360892
s252    c1  115 120 chr22:33360892
s252    c2  13  167 chr22:33360892
s253    c0  14  16  chr22:33380380
s253    c1  86  87  chr22:33380380
s253    c2  12  236 chr22:33380380
s254    c0  16  18  chr22:33430573
s254    c1  84  87  chr22:33430573
s254    c2  13  168 chr22:33430573
s255    c0  22  24  chr22:33480850
s255    c1  101 106 chr22:33480850
s255    c2  10  133 chr22:33480850
...

sc_hp.txt

ID alpha beta delta0
s0 3.5 49 0.5
s1 5 79 0.5
s2 5.33333333333333 69.6666666666667 0.5
s3 4 15 0.5
s4 4.66666666666667 60 0.5
s5 4 57 0.5
...
s251 6 41 0.5
s252 55.6666666666667 50.6666666666667 0.5
s253 114 14 0.5
s254 54.3333333333333 38.6666666666667 0.5
s255 44.3333333333333 45.3333333333333 0.5
s256 47.6666666666667 47.6666666666667 0.5
s257 18.6666666666667 43.6666666666667 0.5
s258 177.5 16.5 0.5
s259 47.3333333333333 38.3333333333333 0.5
s260 134.666666666667 40.6666666666667 0.5
...

Thanks in advance for your help.

junseonghwan commented 1 year ago

Thank you for reporting the problem. The bulk data doesn't seem to have any variability across generated regions. Can you send me the options you used to simulate the data?

Eric0627 commented 1 year ago

I didn't use the provided scripts to simulate the data. Here are the simulated data files I used: sc_hp.txt, bulk.txt, sc.txt. Can you help me figure out the problem?

junseonghwan commented 1 year ago

You will need to replace the spaces with tabs (\t) as the delimiter in your sc_hp.txt. Sorry for the confusion; it seems the example code was also using spaces instead of tabs.
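The delimiter fix can be scripted. The following is a minimal sketch, not part of PhylEx itself: the helper name spaces_to_tabs is hypothetical, and it assumes that no field in sc_hp.txt contains whitespace, so splitting on whitespace recovers the columns exactly.

```python
def spaces_to_tabs(text: str) -> str:
    """Rewrite whitespace-delimited rows as tab-delimited rows.

    Assumption: no field contains embedded whitespace.
    Blank lines are dropped.
    """
    out = []
    for line in text.splitlines():
        fields = line.split()  # splits on any run of spaces or tabs
        if fields:
            out.append("\t".join(fields))
    return "".join(row + "\n" for row in out)


# Two rows copied from the sc_hp.txt excerpt in this thread.
sample = "ID alpha beta delta0\ns0 3.5 49 0.5\n"
print(spaces_to_tabs(sample), end="")
```

To convert a real file, read its contents, pass them through spaces_to_tabs, and write the result back out. Keeping a copy of the original file first is a sensible precaution.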

I noticed that in your bulk.txt, there are some rows where the major copy number is less than the minor copy number. You will need to correct that as well.
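A quick way to locate the offending rows is sketched below. It is not part of PhylEx: the function name find_swapped_copy_numbers is hypothetical, and the sketch assumes that the last two columns of bulk.txt are the major and minor copy numbers, in that order, each given as comma-separated integers with one value per region. That column order is inferred from the comment above and the excerpt in this thread, so check it against the PhylEx input documentation.

```python
def find_swapped_copy_numbers(lines):
    """Return (variant_id, region_index, major, minor) where major < minor.

    Assumption: the last two columns are major then minor copy number,
    each a comma-separated list of integers.
    """
    problems = []
    for line in lines:
        fields = line.split()
        if len(fields) < 9:
            continue  # skip blank, truncated, or "..." lines
        try:
            majors = [int(x) for x in fields[-2].split(",")]
            minors = [int(x) for x in fields[-1].split(",")]
        except ValueError:
            continue  # skip a header row or any non-numeric entry
        for i, (major, minor) in enumerate(zip(majors, minors)):
            if major < minor:
                problems.append((fields[0], i, major, minor))
    return problems


# Two rows copied from the bulk.txt excerpt in this thread.
rows = [
    "s251 chr22 33300040 G C 10,10,10,10,10 191,191,191,191,191 2,2,2,2,2 2,2,2,2,2",
    "s252 chr22 33360892 C T 164,164,164,164,164 313,313,313,313,313 2,2,2,2,2 4,4,4,4,4",
]
for problem in find_swapped_copy_numbers(rows):
    print(problem)
```

With these two rows, only s252 is reported, once per region, because its second-to-last column (2) is smaller than its last column (4).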

I hope this helps. Please let me know if you run into any problems after fixing the above issues.

Eric0627 commented 1 year ago

Thanks a lot for your help. The problem has been solved.