adimitromanolakis / sim1000G

Simulation of rare and common variants based on 1000 genomes data

Se #2

Closed violafanfani closed 4 years ago

violafanfani commented 4 years ago

Hello, I am trying to run a simulation on chr22 and I am repeatedly getting a segmentation fault. I am not sure if I am doing something wrong or if there is a bug somewhere.

I am running an adapted version of the script here (https://cran.r-project.org/web/packages/sim1000G/vignettes/ExtractingRegionsForSimulation.html) using only the EUR population. The VCF file is the whole of chr22 (filtered with bcftools as described). I am running it on a cluster on a single core with 92 GB of memory, but the same thing happened with 128 GB of memory and more cores.

I am fairly sure the code is working, as I can run it with a small region of chr22 (as in the chr4 example you provide). The id_ped are the pedigrees of the EUR populations.

Is there anything I've missed from the documentation?

Thanks for the attention, I attach the log of the job in question.

Cheers Viola

This is the log of the job: ++++++++++++++++++

.....
Parsed with column specification:
cols(
  .default = col_character(),
  `#CHROM` = col_double(),
  POS = col_double(),
  QUAL = col_double()
)
See spec(...) for full column specifications.

 *** caught segfault ***
address 0x2b8d347b7220, cause 'memory not mapped'

Traceback:
 1: cor(dat)
 2: haplodata(haplomatrix[, polymorphic])
 3: startSimulation(vcf, subset = id_ped)
An irrecoverable exception occurred. R is aborting now ...
/bin/bash: line 1: 23945 Segmentation fault      (core dumped)
.....

adimitromanolakis commented 4 years ago

Hi Viola,

How many variants are present in the file? The current version of sim1000G cannot load more than a few thousand variants.

Apostolos
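
To answer the question about the variant count before calling startSimulation, the records in the VCF can be counted directly. The following is a minimal base-R sketch, not part of sim1000G: the helper name count_vcf_variants and the tiny demo file are hypothetical, and it simply counts every non-empty line that does not start with "#".

```r
# Count variant records in a VCF: every non-empty line that does not start
# with "#". Base R only; gzfile() reads both plain and gzip-compressed files.
# count_vcf_variants is a hypothetical helper, not a sim1000G function.
count_vcf_variants <- function(path, chunk_size = 100000L) {
  con <- gzfile(path, open = "r")
  on.exit(close(con))
  n <- 0L
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0L) break
    n <- n + sum(nzchar(lines) & !startsWith(lines, "#"))
  }
  n
}

# Tiny made-up VCF written to a temporary file, for illustration only.
demo_vcf <- tempfile(fileext = ".vcf")
writeLines(c(
  "##fileformat=VCFv4.1",
  "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
  "22\t16050075\t.\tA\tG\t100\tPASS\t.",
  "22\t16050115\t.\tG\tA\t100\tPASS\t.",
  "22\t16050213\t.\tC\tT\t100\tPASS\t."
), demo_vcf)

n_variants <- count_vcf_variants(demo_vcf)
print(n_variants)
```

Reading in chunks keeps memory use flat even for a whole-chromosome file, so the check itself should be safe to run on the full chr22 VCF.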

violafanfani commented 4 years ago

Hi Apostolos,

Thank you for the quick reply. I was mistakenly using more than 300K variants. However, I would definitely need more than 10K. Do you know if that is possible? What is the limit due to?

Thanks,
Viola

adimitromanolakis commented 4 years ago

Hi Viola,

It should be possible, but it will require more computing time and memory. We haven't tested it with more than ~15,000 markers.

Thanks,

Apostolos
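
As a rough illustration of where the memory goes: the traceback in the original report ends in cor(dat) inside haplodata(), which suggests that a dense variants-by-variants correlation matrix is built. This is an assumption drawn from the traceback, not confirmed from the package source. Under that assumption, the matrix alone needs 8 bytes per entry, so its size grows with the square of the variant count. The helper name cor_matrix_gib below is hypothetical.

```r
# Rough memory needed for a dense p x p matrix of doubles (8 bytes per entry),
# which is what cor() returns for an input matrix with p columns.
# Assumption: the simulation builds such a matrix over all loaded variants.
cor_matrix_gib <- function(n_variants) {
  as.numeric(n_variants)^2 * 8 / 1024^3
}

# Variant counts mentioned in this thread, plus 1,000 for scale.
sizes <- c(1e3, 1e4, 1.5e4, 3e5)
estimate <- data.frame(
  variants   = sizes,
  approx_GiB = round(cor_matrix_gib(sizes), 2)
)
print(estimate)
```

By this estimate, 15,000 markers need under 2 GiB for the matrix, while 300K variants would need about 670 GiB, well beyond the 92 GB and 128 GB nodes mentioned earlier. The real footprint may differ, since only polymorphic sites are passed on and intermediate copies add overhead.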
