Open suyanxun opened 4 years ago
Hi,
Please check that the biomart object you are using is indeed properly loaded. You can double check this by accessing hb$genes
in the tutorial.
Please check that the gene names consistent with the gene names used in your expression matrix.
Best, Jean
Dear Jean, I had the same problem running the tutorial, as suyanxun reported.
> hb$calcGexpCnvBoundaries(init=TRUE, verbose=FALSE)
Loading required package: rjags
Loading required package: coda
Linked to JAGS 4.3.0
Loaded modules: basemod,bugs
ERROR: Error: subscript contains invalid names
NULL
As you suggested, I checked the hb$genes slot and the gene names in the matrix of the example and everything seems fine.
> print(gexp[1:5,1:5])
MGH31_A02 MGH31_A03 MGH31_A04 MGH31_A05 MGH31_A06
AAAS 3.897660 -4.5248935 -4.183228 2.7093435 3.5550152
AADAT 2.166325 1.5769039 3.300534 2.1134482 0.8006273
AAGAB -4.193448 0.4587702 -4.232739 -2.6151893 -2.4645792
AAK1 2.360924 2.6094029 2.830466 2.1001597 2.1811283
AAMP 3.484948 2.2440590 3.007798 0.4639664 1.8083660
> head(hb$genes)
GRanges object with 6 ranges and 0 metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
AAAS chr12 53307456-53324864 *
AADAT chr4 170060222-170091699 *
AAGAB chr15 67200667-67255195 *
AAK1 chr2 69457997-69674349 *
AAMP chr2 218264123-218270257 *
AAR2 chr20 36236459-36270918 *
-------
seqinfo: 69 sequences from an unspecified genome; no seqlengths
how can we solve this issue in the tutorial? I was just checking each step of the pipeline before perform the analysis on my own datatset.
Thanks
Hi,
Thanks for the bug report. This may have to do with biomart or a related dependency being updated.
To help me with the debugging, could you please let me know if the previous steps in the tutorial:
hb <- new('HoneyBADGER', name='MGH31')
hb$setGexpMats(gexp, ref, mart.obj, filter=FALSE, scale=FALSE, verbose=TRUE)
hb$plotGexpProfile() ## initial visualization
hb$setMvFit(verbose=TRUE) ## model variance
hb$setGexpDev(verbose=TRUE) ## model necessary expression deviation to identify CNVs
All worked properly? (e.g. without error or warning)
Also, after running calcGexpCnvBoundaries
, is there any output provided by:
## double check what CNVs were identified
bgf <- hb$bound.genes.final
genes <- hb$genes
regions.genes <- range(genes[unlist(bgf)])
print(regions.genes)
Or is the output null?
Thanks for the help, Jean
On Jan 10, 2020, at 8:21 AM, kylandra notifications@github.com<mailto:notifications@github.com> wrote:
Dear Jean, I had the same problem running the tutorial, as suyanxun reported.
hb$calcGexpCnvBoundaries(init=TRUE, verbose=FALSE) Loading required package: rjags Loading required package: coda Linked to JAGS 4.3.0 Loaded modules: basemod,bugs ERROR: Error: subscript contains invalid names NULL
As you suggested, I checked the hb$genes slot and the gene names in the matrix of the example and everything seems fine.
print(gexp[1:5,1:5]) MGH31_A02 MGH31_A03 MGH31_A04 MGH31_A05 MGH31_A06 AAAS 3.897660 -4.5248935 -4.183228 2.7093435 3.5550152 AADAT 2.166325 1.5769039 3.300534 2.1134482 0.8006273 AAGAB -4.193448 0.4587702 -4.232739 -2.6151893 -2.4645792 AAK1 2.360924 2.6094029 2.830466 2.1001597 2.1811283 AAMP 3.484948 2.2440590 3.007798 0.4639664 1.8083660
head(hb$genes) GRanges object with 6 ranges and 0 metadata columns: seqnames ranges strand
AAAS chr12 53307456-53324864 * AADAT chr4 170060222-170091699 * AAGAB chr15 67200667-67255195 * AAK1 chr2 69457997-69674349 * AAMP chr2 218264123-218270257 * AAR2 chr20 36236459-36270918 * ------- seqinfo: 69 sequences from an unspecified genome; no seqlengths
how can we solve this issue in the tutorial? I was just checking each step of the pipeline before perform the analysis on my own datatset.
Thanks
— You are receiving this because you commented. Reply to this email directly, view it on GitHubhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_JEFworks_HoneyBADGER_issues_34-3Femail-5Fsource-3Dnotifications-26email-5Ftoken-3DACEPMSBJ3K4GC4SASV3U3ZTQ5BY5RA5CNFSM4IZCG34KYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEIT4R6Q-23issuecomment-2D573032698&d=DwMCaQ&c=WO-RGvefibhHBZq3fL85hQ&r=2gb0vmLv11Vi98WTAqlCXyDkhi11d9lKeGWDXEU-qNw&m=sacPl6ehGsUf0W5tx7MFmwWS2ZNsqqYsq0zGdb6UUzk&s=SMKV3AkrFdkOYuy4gsVmOSPCQLQbB81Gjhwj0prZ7yU&e=, or unsubscribehttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_ACEPMSB7VFCPCQSTSHFMKG3Q5BY5RANCNFSM4IZCG34A&d=DwMCaQ&c=WO-RGvefibhHBZq3fL85hQ&r=2gb0vmLv11Vi98WTAqlCXyDkhi11d9lKeGWDXEU-qNw&m=sacPl6ehGsUf0W5tx7MFmwWS2ZNsqqYsq0zGdb6UUzk&s=bU5ielDim2m8jcN1zBjn8YTmpQ_2bXB4zBiC8GwE7xU&e=.
Hi Jean,
the version of biomart I'm using is biomaRt_2.42.0, if this can help you.
The first steps worked with no problem:
> hb <- new('HoneyBADGER', name='MGH31')
> hb$setGexpMats(gexp, ref, mart.obj, filter=FALSE, scale=FALSE, verbose=TRUE)
Initializing expression matrices ...
Normalizing gene expression for 6082 genes and 75 cells ...
Cache found
Done setting initial expression matrices!
> hb$plotGexpProfile()
> hb$setMvFit(verbose=TRUE)
Modeling expected variance ... Done!
> hb$setGexpDev(verbose=TRUE)
Optimal deviance: 1.015797
Then I get the error:
hb$calcGexpCnvBoundaries(init=TRUE, verbose=FALSE)
Loading required package: rjags
Loading required package: coda
Linked to JAGS 4.3.0
Loaded modules: basemod,bugs
ERROR: Error: subscript contains invalid names
NULL
But if I double check what CNVs were identified, I obtain this result:
> bgf <- hb$bound.genes.final
> genes <- hb$genes
> regions.genes <- range(genes[unlist(bgf)])
> print(regions.genes)
GRanges object with 5 ranges and 0 metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
[1] chr11 167784-2397419 *
[2] chr7 816615-158956747 *
[3] chr9 137241287-138124624 *
[4] chr10 274190-133373354 *
[5] chr5 36876769-179634784 *
-------
seqinfo: 69 sequences from an unspecified genome; no seqlengths
So is it working?
I continued with the tutorial, and I got this error:
> hb$retestIdentifiedCnvs(retestBoundGenes = TRUE, retestBoundSnps = FALSE, verbose=FALSE)
> results <- hb$summarizeResults(geneBased=TRUE, alleleBased=FALSE)
> print(head(results[,1:5]))
> trees <- hb$visualizeResults(geneBased=TRUE, alleleBased=FALSE, details=TRUE, margins=c(25,15))
> hc <- trees$hc
> order <- hc$labels[hc$order]
> hb$plotGexpProfile(cellOrder=order)
Error in d[, cellOrder] : index out of bounds
and checking order vector I obtain:
> head(order)
[1] "del.gexp.MGH31_E09" "del.gexp.MGH31_A08" "del.gexp.MGH31_H07" "del.gexp.MGH31_G04"
[5] "del.gexp.MGH31_F10" "del.gexp.MGH31_F01"
what is going on?
thanks Oriana
Dear Jean,
I have exactly the same problem. Did you finally solve this issue?
Thanks
Aurélie
@kylandra @aureliebousard @JEFworks
Hi everyone!
In regards to the index out of bounds error, I tried the following:
Error in d[, cellOrder] : index out of bounds
I'm thinking the cellOrder
is looking for integers, not strings. So I removed hc$labels
.
order <- hc$labels[hc$order]
changed to order = hc$order
The following code gave me the correct graph similar to the getting started tutorial.
> hb$setMvFit(verbose=TRUE)
> hb$setGexpDev(verbose=TRUE)
> hb$calcGexpCnvBoundaries(init=TRUE, verbose=FALSE)
> order <- hc$order
> hb$plotGexpProfile(cellOrder=hc$order)
I hope this helps. This allowed me to finish the getting started tutorial.
Jesus
I have the same issue. If continue, it halts at :
results <- hb$summarizeResults(geneBased=TRUE, alleleBased=FALSE) Error in data.frame(..., check.names = FALSE) : arguments imply differing number of rows: 5, 4
@tinyi I had the same error, I found that the error of differing number of rows: 5, 4 is caused by mismatching row number of rgs and amp.gexp.prob. May need to get the same rows to bind them. Good luck!
Qiu
@suyanxun @aureliebousard @kylandra - you may have to specify a different host for the biomart.
The following is taken from the setGexpMats
help and works fine.
mart.obj <- useMart(biomart = "ENSEMBL_MART_ENSEMBL", dataset = 'hsapiens_gene_ensembl',
host = "jul2015.archive.ensembl.org")
R
What's wrong when I run the calcGexpCnvBoundaries using example data like https://jef.works/HoneyBADGER/Getting_Started.html?