clindet / anor

anor: an annotation and visualization system based on R and Shiny framework
https://jhuanglab.github.io/anor/

known data.table fread limitation #1

Open Miachol opened 6 years ago

Miachol commented 6 years ago

In the fread function, the skip parameter cannot take a value greater than 2500000000. If the database file has more than 2500000000 lines, you need to split the raw database file before building.

For example:

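# Shell: split the raw database file into chunks of 2499999999 lines
/usr/bin/split -l 2499999999 hg19_eigen.txt hg19_eigen.txt_split

# R: build from each chunk in turn, appending to the same SQLite database.
# If the first 2499999999 lines have already been written to the SQLite file, you can start from "ab".
# Note: the first mv below overwrites the original hg19_eigen.txt; keep a copy if you still need it.
new.colnames <- c("#Chr", "Start", "End", "Ref", "Alt", "Eigen")
for (i in c("aa", "ab", "ac", "ad")) {
  # Temporarily rename the chunk to hg19_eigen.txt
  system(sprintf("mv hg19_eigen.txt_split%s hg19_eigen.txt", i))
  annovarR::sqlite.auto.build('eigen', database.dir = './', append = TRUE, new.colnames = new.colnames)
  # Restore the chunk's split name
  system(sprintf("mv hg19_eigen.txt hg19_eigen.txt_split%s", i))
}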
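
Before splitting, it may be worth checking whether the file actually exceeds the limit. The following is only a minimal shell sketch of that check: split_if_needed is a hypothetical helper name (not part of annovarR or data.table), and the default chunk size is simply the 2499999999 used in the example above.

```shell
#!/bin/sh
# Hypothetical helper (not part of annovarR): split FILE into chunks of
# MAX_LINES lines, but only if FILE has more lines than MAX_LINES.
# Usage: split_if_needed FILE [MAX_LINES]
split_if_needed() {
  file=$1
  max_lines=${2:-2499999999}   # assumed default, taken from the example above
  # Count newline-terminated lines; tr strips the padding some wc versions add
  n_lines=$(wc -l < "$file" | tr -d ' ')
  if [ "$n_lines" -gt "$max_lines" ]; then
    # Writes FILE_splitaa, FILE_splitab, ... next to the original file
    split -l "$max_lines" "$file" "${file}_split"
    echo "split"
  else
    echo "no split needed"
  fi
}
```

For example, `split_if_needed hg19_eigen.txt` would leave a small file alone and produce the hg19_eigen.txt_split* chunks for a file over the limit. The suffixes it generates (aa, ab, ...) are the ones to loop over in the R code.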