Bioconductor / BSgenome

Software infrastructure for efficient representation of full genomes and their SNPs
https://bioconductor.org/packages/BSgenome

forgeBSgenomeDataPkg error in "Getting chrom info" #70

Closed ld9866 closed 1 year ago

ld9866 commented 1 year ago

Dear developer: We have newly assembled a pig genome and want to run some ATAC data analyses on it, so we are trying to build a BSgenome package that provides the genome and annotation files ArchR requires. We prepared the seed file, but forging failed with the output shown below. I think this may be because the chrom info is being fetched from UCSC. How should we prepare this file so that it is read locally instead? Best regards

Creating package in C:/atac/BSgenome.Sscrofa.UCSC.susScr11
Copying 'C:/R_tmp/susScr11.2bit' to 'C:/atac/BSgenome.Sscrofa.UCSC.susScr11/inst/extdata/single_sequences.2bit' ... DONE
Getting chrom info from UCSC with 'getChromInfoFromUCSC("susScr11")' ... Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
  line 1 did not have 3 elements
: Warning messages:
1: In forgeBSgenomeDataPkg(y, seqs_srcdir = seqs_srcdir, destdir = destdir, :
  field 'provider_version' is deprecated in favor of 'genome'
2: In forgeBSgenomeDataPkg(y, seqs_srcdir = seqs_srcdir, destdir = destdir, :
  field 'release_name' is deprecated
3: In (function (file, header = FALSE, sep = "", quote = "\"'", dec = ".", :
  line 1 appears to contain embedded nulls
4: In (function (file, header = FALSE, sep = "", quote = "\"'", dec = ".", :

Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : line 1 did not have 3 elements

Package: BSgenome.Sscrofa.UCSC.susScr11
Title: Full genome sequences for Sus scrofa (UCSC version susScr11)
Description: Full genome sequences for Sus scrofa (Pig) as provided by UCSC (susScr11, Feb. 2017) and stored in Biostrings objects.
Version: 1.4.2
organism: Sus scrofa
common_name: Pig
provider: UCSC
provider_version: susScr11
release_date: Feb. 2017
release_name: Swine Genome Sequencing Consortium Sscrofa11.1
source_url: http://hgdownload.cse.ucsc.edu/goldenPath/susScr11/bigZips/
organism_biocview: Sus_scrofa
BSgenomeObjname: Sscrofa
circ_seqs: "chrM"
SrcDataFiles: susScr11.2bit from http://hgdownload.cse.ucsc.edu/goldenPath/susScr11/bigZips/
PkgExamples: genome$chr1 # same as genome[["chr1"]]
seqs_srcdir: C:/R_tmp
seqfile_name: susScr11.2bit

hpages commented 1 year ago

I don't understand why you are trying to forge BSgenome.Sscrofa.UCSC.susScr11 when this package is already available in Bioconductor?

ld9866 commented 1 year ago

In fact, we have newly assembled a pig genome that differs from “Sscrofa.UCSC.susScr11”. I ran into problems building the BSgenome files from our own genome, so I used the "BSgenome.Sscrofa.UCSC.susScr11" seed and related files to test and debug. Unfortunately, that build did not complete either. The error is the same: "Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :"

hpages commented 1 year ago

But you are showing the seed file for susScr11. How does that help? Why aren't you showing your seed file?

ld9866 commented 1 year ago

Dear hpages: Our seed file is similar to the one for the reference genome, so we wanted to reproduce the build of the reference genome package first. If that build succeeds, the problem must lie in how we prepared our own files, which would narrow down the troubleshooting. After further consideration, we concluded that our assembly is very similar to the reference genome, and that the reference genome may be more representative for downstream analysis. We have therefore chosen to use the reference genome for the follow-up analysis and will not forge our own package for now. Thank you for your help.
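
Note for readers who do want to forge a package from a locally assembled genome: the thread never answers the original question, so what follows is only a rough sketch of a seed file for a custom assembly, not a confirmed fix. It uses the 'genome' field that the first warning in the log recommends in place of 'provider_version', drops the deprecated 'release_name' field, and lists the sequence names explicitly in a 'seqnames' field. The provider name (MyLab), assembly name (pigAsm1), 2bit file name, version, and chromosome list are all hypothetical placeholders. It is also an assumption, worth checking against the BSgenomeForge vignette, that a provider/genome pair that is not registered with UCSC avoids the remote 'getChromInfoFromUCSC()' call seen in the log. The lines starting with '#' are annotations for the reader and should be deleted before the file is used.

```
# Hypothetical package, provider and assembly names (not from the thread).
Package: BSgenome.Sscrofa.MyLab.pigAsm1
Title: Full genome sequences for Sus scrofa (MyLab version pigAsm1)
Description: Full genome sequences for Sus scrofa (Pig) from a local assembly, stored in Biostrings objects.
Version: 0.99.0
organism: Sus scrofa
common_name: Pig
provider: MyLab
# 'genome' replaces the deprecated 'provider_version' field.
genome: pigAsm1
organism_biocview: Sus_scrofa
BSgenomeObjname: Sscrofa
# Assumed chromosome naming; must match the sequence names in the 2bit file.
seqnames: paste0("chr", c(1:18, "X", "Y", "M"))
circ_seqs: "chrM"
# Directory taken from the original post; the file name is a placeholder.
seqs_srcdir: C:/R_tmp
seqfile_name: pigAsm1.2bit
```

The 2bit file named in 'seqfile_name' would have to be produced from the assembly's FASTA beforehand, and the seed file is then passed to forgeBSgenomeDataPkg() as in the original attempt.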