Illumina / ExpansionHunter

A tool for estimating repeat sizes

stoi error when running with large variant catalog #179

Open uguenke opened 1 year ago

uguenke commented 1 year ago

Hi,

I am trying to run ExpansionHunter (EH) on WGS CRAM files (GRCh38) with a large variant catalog (more than 6000 loci). When I run it, I get the following message and the process stops:

/path/to/ExpansionHunter --reads /path/to/SAMPLE.cram --reference /path/to/reference/hg38.fa --variant-catalog /path/to/variant_catalog/variant_catalog_cgg_hg38.json --output-prefix SAMPLE
2023-05-24T10:05:14,[Starting ExpansionHunter v5.0.0]
2023-05-24T10:05:14,[Analyzing sample SAMPLE]
2023-05-24T10:05:14,[Initializing reference /path/to/reference/hg38.fa]
2023-05-24T10:05:14,[Loading variant catalog from disk /path/to/variant_catalog/variant_catalog_cgg_hg38.json
2023-05-24T10:05:14,[stoi]

I tried switching to streaming mode and adding more threads, but the error is the same. When I split the catalog, smaller catalogs (around 60 loci) run fine, but with anything larger I get the same message.

Is there anything I can do about this? I work on a shared computing server.

Thank you very much for your help

Kevin
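The splitting step described above can be scripted, which makes it easier to narrow a failure down to a small group of loci. The following is a minimal sketch, assuming the catalog is a top-level JSON array of locus objects; the function name `split_catalog` and the `chunk_NNNN.json` file names are made up for illustration and are not part of ExpansionHunter.

```python
import json
from pathlib import Path


def split_catalog(catalog_path, chunk_size, out_dir):
    """Write consecutive slices of a variant catalog to separate JSON files.

    Assumes the catalog is a JSON array of locus objects.
    Returns the list of chunk paths in catalog order.
    """
    entries = json.loads(Path(catalog_path).read_text())
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for start in range(0, len(entries), chunk_size):
        # Hypothetical naming scheme: chunk_0000.json, chunk_0001.json, ...
        chunk_path = out_dir / f"chunk_{start // chunk_size:04d}.json"
        chunk_path.write_text(json.dumps(entries[start:start + chunk_size], indent=4))
        written.append(chunk_path)
    return written
```

Each chunk can then be passed to `--variant-catalog` in turn; a chunk that reproduces the `[stoi]` message can be split again with a smaller `chunk_size` until a single locus remains.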

dwill023 commented 3 months ago

Yeah, I'm getting the same [stoi] error, but my JSON file only has 20 LocusId entries. When I subset the JSON to 6 entries, or to just 1, it works, but when I run it on all 20 I get the error.

I've attached my JSON. I've tried combing through it but can't see any issues.

Please advise.

Attachment: codis_strs.json

dwill023 commented 3 months ago

I've managed to solve the issue. One LocusId (below) has an incorrect last ReferenceRegion: there's an extra digit "8" in the start coordinate (2180148922 instead of 218014922). Once I fixed this, I was able to run the entire JSON.

{
    "LocusId": "D2S1338",
    "LocusStructure": "(GGAA)*GGAC(GGAA)*(GGCA)*",
    "ReferenceRegion": [
        "2:218014858-218014866",
        "2:218014870-218014922",
        "2:2180148922-218014950"
    ],
    "VariantType": [
        "Repeat",
        "Repeat",
        "Repeat"
    ]
}
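A typo like this is hard to spot by eye in a long catalog, so a small pre-flight check may help. The sketch below is not part of ExpansionHunter. It assumes the catalog is a JSON array of objects with `LocusId` and `ReferenceRegion` fields as in the entry above, and that `ReferenceRegion` may be either a single string or a list of strings. The 32-bit check rests on a further assumption that is not confirmed in this thread: the bare `[stoi]` message is consistent with a C++ `std::stoi` conversion failing, and 2180148922 is larger than the maximum 32-bit signed integer (2147483647). The names `check_region` and `check_catalog` are illustrative.

```python
import json
import re

INT32_MAX = 2**31 - 1  # 2147483647, largest 32-bit signed integer
REGION_RE = re.compile(r"^([^:\s]+):(\d+)-(\d+)$")  # chrom:start-end


def check_region(region):
    """Return a list of problems found in one 'chrom:start-end' string."""
    match = REGION_RE.match(region)
    if match is None:
        return ["not in chrom:start-end form"]
    start, end = int(match.group(2)), int(match.group(3))
    problems = []
    if start > INT32_MAX or end > INT32_MAX:
        # Assumption: coordinates are parsed into a 32-bit int by the tool.
        problems.append("coordinate exceeds 32-bit signed int range")
    if start >= end:
        problems.append("start is not smaller than end")
    return problems


def check_catalog(entries):
    """Yield (LocusId, region, problem) for every suspicious region."""
    for entry in entries:
        regions = entry.get("ReferenceRegion", [])
        if isinstance(regions, str):  # a single region may be a plain string
            regions = [regions]
        for region in regions:
            for problem in check_region(region):
                yield entry.get("LocusId", "?"), region, problem


# Example using the regions from the entry above.
entries = [{
    "LocusId": "D2S1338",
    "ReferenceRegion": [
        "2:218014858-218014866",
        "2:218014870-218014922",
        "2:2180148922-218014950",
    ],
}]
for locus, region, problem in check_catalog(entries):
    print(locus, region, problem, sep="\t")
```

To check a real catalog, load it with `json.load` and pass the resulting list to `check_catalog`. The check only covers coordinate syntax and ranges; it does not verify that the number of regions matches `LocusStructure` or `VariantType`.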