FunGeST / Palimpsest

An R package for studying mutational signatures and structural variant signatures along clonal evolution in cancer.

Error in annotating the data #7

Closed cedricvanm closed 6 years ago

cedricvanm commented 6 years ago

Hi, this package looks awesome, but I am having difficulties running it right from the start.

If I use the example data and run

vcf <- preprocessInput_snv(input_data = mut_data, ensgene = ensgene, reference_genome = ref_genome)

I get the error:

Error in validObject(result) : invalid class “VRanges” object: 'strand' should be a 'factor' Rle with levels c("+", "-", "*")

And if I use my own data, from one sample, loaded as a TSV, I get this error at the same step:

Error in .normargSEW0(start, "start") : 'start' must be a numeric vector (or NULL)

However, I made sure that the position column is numeric (as are the three read depth columns).

My session info:

R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=French_Belgium.1252 LC_CTYPE=French_Belgium.1252 LC_MONETARY=French_Belgium.1252 LC_NUMERIC=C
[5] LC_TIME=French_Belgium.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] compiler_3.5.1 tools_3.5.1 yaml_2.2.0

Any idea how to fix these two errors? Thank you very much!

FunGeST commented 6 years ago

Hi! Thanks for reaching out. We were already working on a fix for error 1, which was caused by updates to some of the package's dependencies. I have now updated the function. Could you reinstall the package to pick up the most recent fixes? This should resolve your first error.

Regarding error 2, I would advise you to stick to the input file format described in the README (https://github.com/FunGeST/Palimpsest/blob/master/README.md). The input headers described there are mandatory.

Let us know if you have any other questions or need further assistance. Best regards,

cedricvanm commented 6 years ago

Thank you for your quick response! Error number 1 is indeed fixed with the package update.

For error 2, my TSV file has exactly the mandatory headers, nothing more and nothing less.

If I convert my TSV file to an RData file and load that file in the script, I get the same error as with the TSV file:

Error in .normargSEW0(start, "start") : 'start' must be a numeric vector (or NULL)

If I take your example file, convert it to a TSV, and load it again, it also fails, but with a different error:

Error in `[<-.data.frame`(`*tmp*`, , namesColsToAdd, value = c(NA_character_, : replacement has 5372 rows, data has 1

So the issue is indeed the file format.

Any idea how to proceed? I generate my sequencing data outside R, in a Java environment. The output of my variant filtering is saved as a TSV.

I really hope you can help me with this. Have a very good day,

Cedric

FunGeST commented 6 years ago

Hi Cedric, this seems to be an error in reading your input file. I am not familiar with the method you use to convert files between formats, but I strongly recommend reading your file into R with a command like the following:

mut_data <- read.delim(vcf_file, as.is = TRUE, header = TRUE)

Regards,

Jay
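
The read.delim() call recommended above can be tried on a small stand-in file first. The following is a minimal sketch only: the column names and values are hypothetical placeholders (the real mandatory headers are the ones listed in the README), and the temporary file stands in for a real TSV. It reads the file into a plain data.frame and checks that the position column comes back as a bare numeric vector.

```r
# Hypothetical two-row table written to a temporary file, standing in for a real TSV.
# The column names here are placeholders, not the package's documented headers.
tsv_path <- tempfile(fileext = ".tsv")
writeLines(c("Sample\tCHROM\tPOS\tREF\tALT",
             "S1\tchr1\t12345\tA\tG",
             "S1\tchr2\t67890\tC\tT"), tsv_path)

# as.is = TRUE keeps text columns as character instead of converting them to factors.
mut_data <- read.delim(tsv_path, as.is = TRUE, header = TRUE)

# Inspect the class of every column before passing the table on.
col_classes <- vapply(mut_data, function(x) class(x)[1], character(1))
print(col_classes)

# Extracting a single column from a plain data.frame yields a bare vector.
pos <- mut_data[, "POS"]
stopifnot(is.data.frame(mut_data), is.numeric(pos))
```

Checking the column classes this way is a quick sanity test before calling preprocessInput_snv(); it does not by itself confirm that the table matches the format the package expects.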

cedricvanm commented 6 years ago

Hi Jay, great, this worked fine for importing the TSV files. However, read_tsv() and load() do not work.

Thanks for the troubleshooting!

Cedric
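
The thread does not establish why load() failed here. One general property of base R that may be relevant is that load() restores objects under the names they were saved with and returns only those names, so assigning its result to a variable does not produce a data frame. The sketch below illustrates that behaviour with a hypothetical object name (saved_table) and a temporary file; whether this is what happened in this case is an assumption.

```r
# Hypothetical table saved to a temporary .RData file.
saved_table <- data.frame(POS = c(12345L, 67890L), REF = c("A", "C"))
rdata_path <- tempfile(fileext = ".RData")
save(saved_table, file = rdata_path)

# Load into a separate environment so nothing in the workspace is overwritten.
env <- new.env()
loaded <- load(rdata_path, envir = env)

# 'loaded' is a character vector of object names, not the data itself.
print(loaded)

# Fetch the restored object by name, then coerce to a plain data.frame.
# as.data.frame() also converts data-frame subclasses to a plain data.frame.
mut_data <- as.data.frame(get(loaded[1], envir = env))
stopifnot(is.data.frame(mut_data), is.numeric(mut_data[, "POS"]))
```

The final as.data.frame() call is a defensive step: it guarantees a plain data.frame regardless of which data-frame subclass the saved object had.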