timothyfrasier / related

Program stoped in subroutine StopOnDataError #19

Open cristianP04 opened 8 months ago

cristianP04 commented 8 months ago

Hi Timothy,

I have a dataset of 730 individuals from 13 populations, genotyped at 13 loci (with some missing data). Sample sizes differ among populations (from 7 to 260 individuals).

I split the complete dataset into 13 sub-datasets, one per population, to calculate the TrioML inbreeding coefficient for each population with the coancestry function, as shown below:

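# List and read file paths
listoffile <- list.files(path = ".", pattern = ".txt", full.names = TRUE)
pops <- lapply(listoffile, readgenotypedata)

# Initiate an empty list the same length as pops
inbreed_coeff <- vector("list", length(pops))

for (i in seq_along(pops)) {
  # Calculate the TrioML coefficient of inbreeding for each individual
  coancestry_results <- coancestry(
    genotype.data = pops[[i]]$gdata,
    trioml = 2,
    ci95.num.bootstrap = 100L,
    allow.inbreeding = FALSE,
    output.file = FALSE
  )
  # Store the inbreeding results for each population
  inbreed_coeff[[i]] <- coancestry_results$inbreeding
}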

However, I am getting the following error:

The observed # alleles at each locus should be in range [1,127]!
Errors in DATA. Insufficient data or incorrect format.
Please check DATA and format and then re-run the program
Program stoped in subroutine StopOnDataError
Error in system.time(.Fortran("related", PACKAGE = "related")) :
  Related encountered a fatal error.
Timing stopped at: 0.78 1.222 2.034

When I run the coancestry function on each population separately, the error occurs only for two populations, both of which have missing data for every sample at one locus (the last one).
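The link between the error message and an all-missing locus can be checked directly. Below is a minimal base-R sketch, assuming the genotype table has an ID column followed by two allele columns per locus, with `0` coding a missing allele. The helper name `count_alleles_per_locus` and the toy data frame are hypothetical and not part of the related package.

```r
# Count distinct observed (non-missing) alleles at each locus.
# Assumption: column 1 holds sample IDs, followed by two allele columns
# per locus, and 0 means "missing".
count_alleles_per_locus <- function(gdata, missing = 0) {
  allele_cols <- gdata[, -1, drop = FALSE]
  n_loci <- ncol(allele_cols) / 2
  stopifnot(n_loci == floor(n_loci))
  vapply(seq_len(n_loci), function(locus) {
    alleles <- unlist(allele_cols[, c(2 * locus - 1, 2 * locus)],
                      use.names = FALSE)
    length(unique(alleles[!is.na(alleles) & alleles != missing]))
  }, integer(1))
}

# Toy data (hypothetical): 3 individuals, 3 loci; locus 3 is missing for everyone.
toy <- data.frame(
  id  = c("ind1", "ind2", "ind3"),
  L1a = c(166, 152, 152), L1b = c(166, 166, 166),
  L2a = c(169, 169, 173), L2b = c(173, 169, 173),
  L3a = c(0, 0, 0),       L3b = c(0, 0, 0)
)

n_alleles  <- count_alleles_per_locus(toy)
empty_loci <- which(n_alleles == 0)  # loci with no observed alleles
print(n_alleles)
print(empty_loci)
```

A locus with 0 observed alleles falls outside the [1,127] range named in the error message, which would be consistent with the error appearing only for those two populations.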

How can I fix this?

Thank you in advance for your help.

timothyfrasier commented 7 months ago

Hi Cristian:

I'm sorry that you are having this problem. Can you send me one of the input files that gives you this error, so I can see if anything jumps out?

-Tim


Timothy R. Frasier Coordinator: Forensic Sciences Program Professor: Biology Saint Mary's University 923 Robie Street Halifax, Nova Scotia B3H 3C3 Canada Tel: (902) 491-6382 E-mail: @.*** frasierlab.ca


From: cristianP04 @.> Sent: Thursday, November 30, 2023 1:27 PM To: timothyfrasier/related @.> Cc: Subscribed @.***> Subject: [timothyfrasier/related] Program stoped in subroutine StopOnDataError (Issue #19)


cristianP04 commented 7 months ago

Hi Timothy,

Thank you for getting back to me.

Please find attached one of the problematic sub-datasets I created.

Cristian

On Tue, 5 Dec 2023 at 15:48, timothyfrasier @.***> wrote:


20585_GamC 166 166 169 173 194 194 205 209 167 167 251 255 129 133 152 156 180 180 125 129 201 205 232 240 000 000
20591_GamC 152 166 169 169 194 194 205 205 167 171 255 255 129 133 156 156 180 188 125 129 205 205 232 232 000 000
20594_GamC 152 166 173 173 194 194 201 201 167 167 251 255 129 129 156 156 180 188 125 129 201 205 232 240 000 000
20595_GamC 162 166 169 169 194 194 209 209 171 171 000 000 129 133 156 156 180 184 125 129 205 205 232 232 000 000
20599_GamC 152 166 169 169 194 194 205 205 167 171 251 255 129 133 156 156 184 184 125 129 201 205 232 240 000 000
20600_GamC 166 166 169 169 194 194 201 209 167 171 251 255 129 133 156 156 184 184 125 129 000 000 232 232 000 000
20508_GamC 166 166 169 173 194 194 205 205 171 179 251 255 129 133 148 156 180 188 125 125 205 205 240 244 000 000
20512_GamC 166 166 169 169 194 194 205 209 171 179 255 255 129 129 148 156 180 184 125 125 205 205 236 236 000 000
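In the rows above, the last locus is `000 000` for every individual. One possible workaround, which this thread does not confirm, is to drop such loci from a population's data before estimation. The following is a minimal base-R sketch using the first three rows; `drop_empty_loci` is a hypothetical helper, and whether omitting the locus is acceptable for the analysis is an assumption.

```r
# First three rows of the posted data: ID + 13 loci x 2 alleles.
rows <- "
20585_GamC 166 166 169 173 194 194 205 209 167 167 251 255 129 133 152 156 180 180 125 129 201 205 232 240 000 000
20591_GamC 152 166 169 169 194 194 205 205 167 171 255 255 129 133 156 156 180 188 125 129 205 205 232 232 000 000
20594_GamC 152 166 173 173 194 194 201 201 167 167 251 255 129 129 156 156 180 188 125 129 201 205 232 240 000 000
"
gdata <- read.table(text = rows, stringsAsFactors = FALSE)

# Drop every locus whose two allele columns are missing for all individuals.
# Assumption: column 1 is the ID and 0 codes a missing allele.
drop_empty_loci <- function(gdata, missing = 0) {
  n_loci <- (ncol(gdata) - 1) / 2
  keep <- vapply(seq_len(n_loci), function(locus) {
    cols <- c(2 * locus, 2 * locus + 1)  # the two columns of this locus
    any(gdata[, cols] != missing, na.rm = TRUE)
  }, logical(1))
  kept <- which(keep)
  kept_cols <- c(1, as.vector(rbind(2 * kept, 2 * kept + 1)))
  gdata[, kept_cols, drop = FALSE]
}

gdata_clean <- drop_empty_loci(gdata)
print(dim(gdata))        # 27 columns: ID + 13 loci
print(dim(gdata_clean))  # 25 columns: ID + 12 loci

# gdata_clean could then be passed as genotype.data to coancestry(),
# with the other arguments as in the original call (not run here,
# since it requires the related package).
```

Dropping a locus per population means populations would be analysed with different marker sets, so results may not be directly comparable across populations.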

timothyfrasier commented 7 months ago

Hi Cristian:

My apologies, but I don't see a file attached. Can you send it again? I should be able to look into it soon.

-Tim




From: cristianP04 @.> Sent: Wednesday, December 6, 2023 10:11 AM To: timothyfrasier/related @.> Cc: Timothy Frasier @.>; Comment @.> Subject: Re: [timothyfrasier/related] Program stoped in subroutine StopOnDataError (Issue #19)
