Ax-Sch / AlphScore

Improving pathogenicity prediction of missense variants by using AlphaFold-derived features
GNU General Public License v3.0
6 stars 4 forks

Error in str_split #4

Open sah4030 opened 2 months ago

sah4030 commented 2 months ago

When running this command:

    Rscript workflow/scripts/preprocess.R -i gnomad_extracted.csv.gz -o gnomad_extracted_prepro.csv.gz

I got this error message:

    Error in str_split(n_occur[n_occur$Freq > 1, ]$Var1, " ", simplify = TRUE)[, : subscript out of bounds

Ax-Sch commented 1 month ago

Hi, thanks for the feedback. However, I cannot reproduce the error you got: when I run the preprocess.R script, it finishes without problems.

I run it as follows: navigate to the directory of the cloned repository, then run:

    conda activate AlphScore
    Rscript workflow/scripts/preprocess.R -i results/train_testset1/gnomad_extracted.csv.gz -o gnomad_extracted_prepro.csv.gz

The error you are referring to is probably produced by this line:

    double_ids<-str_split(n_occur[n_occur$Freq > 1,]$Var1, " ", simplify=TRUE)[,1]

In your code stated above, the "1]" is missing at the end. Maybe this is the issue?
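
For illustration only, here is a minimal base R sketch of one way a "subscript out of bounds" error can arise when selecting column 1, and of a more tolerant way to take the first space-separated token. It rests on an assumption that has not been confirmed for this issue: that the filter `n_occur$Freq > 1` matches no rows for the given input, so the split result has no columns. The helper `first_token` is hypothetical and not part of AlphScore.

```r
# Hypothetical helper (not part of AlphScore): return the first
# space-separated token of each element, tolerating empty input.
first_token <- function(x) {
  x <- as.character(x)  # columns derived from table() are often factors
  vapply(strsplit(x, " ", fixed = TRUE),
         function(parts) parts[1],
         character(1))
}

# Selecting column 1 of a matrix with zero columns fails in base R
# with a "subscript out of bounds" error.
empty <- matrix(character(0), nrow = 0, ncol = 0)
failed <- tryCatch({ empty[, 1]; FALSE }, error = function(e) TRUE)
print(failed)

# The helper returns an empty character vector instead of failing.
print(first_token(character(0)))
print(first_token(c("id1 A", "id2 B")))
```

If the assumption holds, checking `sum(n_occur$Freq > 1)` on the input file before the split would show whether any duplicated entries exist at all.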

You could try to recreate the AlphScore environment with the file https://github.com/Ax-Sch/AlphScore/blob/main/workflow/envs/AlphScore_exact_versions_for_documentation.yaml Otherwise, please provide more context (which system, environment, and versions are you using?).

Best, Axel