Hello, I have been having trouble making a viable BED file from R that can be recognized by bedtools for sorting.
Error Output: Unexpected file format. Please use tab-delimited BED, GFF, or VCF. Perhaps you have non-integer starts or ends at line 2
I am not sure if I can freely share the file, as it contains data from a private database. The file has four initial fields (#Chr, start, stop, TargetID). TargetID contains gene names (e.g. APOE), and the other fields hold the corresponding information (Chr is written as an integer). After that, there are over 100 columns of sample data (expression values in decimal notation). The BED file was written with the following R code (on Windows), and as a safety measure I applied dos2unix to the file before using bedtools:
write.table(object, file = "filename", quote = F, sep = "\t", row.names = F, col.names = T)
While bedtools does not recognize my file, I am able to index it with tabix, which is required for fastQTL (https://github.com/francois-a/fastqtl). However, I am unable to read the resulting tabix index and get the following error:
Failed to open file "filename.bed.gz.tbi" : Exec format error
Couldn't understand format of "filename.bed.gz.tbi"
The bad tabix index makes it impossible for me to use fastQTL, as I get the following error for every chunk:
Failed to get region 9:37753805-107690518 in [filename.bed.gz]
Coming back to bedtools, it seems to recognize the example file given with the fastQTL repository (examples folder: phenotypes.bed.gz) for the same sort function. As such, I am using bedtools as a type of testing mechanism to see if I am making a valid BED file.
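A small script can play the same "testing mechanism" role and also say why a line fails. The following is a minimal sketch in Python, not part of bedtools or fastQTL; the names check_bed_line and check_bed_file are hypothetical. It assumes a tab-delimited file with start and end in columns 2 and 3, skips header lines that begin with #, and flags three things that can produce the "non-integer starts or ends" message: leftover carriage returns, padding spaces, and coordinates in scientific notation (R can write a round value such as 100000 as 1e+05).

```python
import re

# A coordinate must consist of ASCII digits only (no spaces, signs, or exponents).
INT_RE = re.compile(r"[0-9]+")


def check_bed_line(line):
    """Return a list of problems found in one BED data line (hypothetical helper)."""
    problems = []
    if "\r" in line:
        problems.append("carriage return (Windows line ending)")
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 3:
        problems.append("fewer than 3 tab-separated fields")
        return problems
    for name, value in (("start", fields[1]), ("end", fields[2])):
        if not INT_RE.fullmatch(value):
            problems.append(f"{name} is not a plain integer: {value!r}")
    return problems


def check_bed_file(path, max_reports=10):
    """Return up to max_reports (line_number, problem) pairs for an uncompressed BED file."""
    reports = []
    # newline="" keeps the original line endings so that "\r" stays visible.
    with open(path, newline="") as handle:
        for line_no, line in enumerate(handle, start=1):
            if line.startswith("#"):  # header line such as "#Chr<TAB>start<TAB>stop..."
                continue
            for problem in check_bed_line(line):
                reports.append((line_no, problem))
                if len(reports) >= max_reports:
                    return reports
    return reports
```

Calling check_bed_file("filename") on the uncompressed file returns an empty list when nothing is flagged; otherwise each entry names the line number and the reason, which is more specific than the bedtools message.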
Based on how I have prepared my BED file, are there any issues that would result in improper formatting? Thank you in advance, and please ask if you need any more info. I could probably share the expression data, but I need to check the guidelines of the repository before I do so (AMP-AD consortium).
I did try to switch over to tensorQTL to see if the formatting wasn't as big of an issue with that program, but I am unable to download it from the repository onto my cluster.
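Fixed it. For future reference, run dos2unix to convert the Windows line endings to Unix ones, and then sed -E 's/ +//g' to remove the unnecessary spaces. Note that without -E (or writing \+ in GNU sed), the + is matched as a literal character rather than as "one or more".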
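For anyone without dos2unix or sed at hand, the same cleanup (dropping carriage returns and stray spaces) can be sketched in Python. This is an illustrative equivalent under stated assumptions, not the commands actually run above; clean_bed_line and clean_bed_file are hypothetical names. Like the sed expression, it removes every space character, so it is only appropriate when no field (for example a gene name) legitimately contains a space.

```python
def clean_bed_line(line):
    """Remove carriage returns and all space characters; tabs and the newline are kept."""
    return line.replace("\r", "").replace(" ", "")


def clean_bed_file(src_path, dst_path):
    """Write a cleaned copy of src_path to dst_path and return the number of lines changed."""
    changed = 0
    # newline="" on both handles prevents Python from translating line endings itself.
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        for line in src:
            cleaned = clean_bed_line(line)
            if cleaned != line:
                changed += 1
            dst.write(cleaned)
    return changed
```

The cleaned file would still need to be compressed with bgzip and indexed again with tabix before fastQTL can read it, since an index built from the old file does not carry over.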