HenrikBengtsson closed 8 years ago
Another problem is that if none of the sequence names exist, isCompatibleWithBySeqNames()
reports TRUE. This is because of the following code:
names <- getSeqNames(this, unique = unique)
namesO <- getSeqNames(other, unique = unique)
idxs <- match(namesO, names)
res <- all(diff(idxs) > 0, na.rm = TRUE)
if (!res) {
  attr(res, "reason") <- "The ordering of sequence names does not match."
}
When no names match, all idxs are NA, which makes diff(idxs) > 0 all NA. With na.rm = TRUE, all() then sees no values at all and returns TRUE, which makes res == TRUE.
Fix this.
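A minimal sketch of one possible guard, written as a standalone function on plain character vectors rather than on the package's objects. The function name seqNamesCompatible() is hypothetical and not part of aroma.seq; the actual fix may differ (for example, it may throw an exception instead of returning FALSE).

```r
# Reproduce the problem: no names match, yet the check passes.
idxs <- match(c("1", "2"), c("chr1", "chr2"))   # all NA
buggy <- all(diff(idxs) > 0, na.rm = TRUE)      # all() over nothing is TRUE

# Hypothetical standalone version of the check with an explicit guard.
seqNamesCompatible <- function(names, namesO) {
  idxs <- match(namesO, names)

  # Guard: if nothing matches, the ordering test below is vacuous.
  if (all(is.na(idxs))) {
    res <- FALSE
    attr(res, "reason") <- "None of the sequence names matches."
    return(res)
  }

  # Original ordering test; NAs from partially matching names are ignored.
  res <- all(diff(idxs) > 0, na.rm = TRUE)
  if (!res) {
    attr(res, "reason") <- "The ordering of sequence names does not match."
  }
  res
}
```

With this guard, seqNamesCompatible(c("chr1", "chr2"), c("1", "2")) no longer reports TRUE, and the "reason" attribute explains why.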
Added assertion of compatibility before returning BWA index set. Now we at least get:
> is <- buildBwaIndexSet(fa)
[2016-01-06 17:17:34] Exception: None of the sequence names matches.
at #06. isCompatibleWithBySeqNames.SequenceContigsInterface(this, other,
...)
- isCompatibleWithBySeqNames.SequenceContigsInterface() is in environment 'aroma.seq'
at #05. isCompatibleWithBySeqNames(this, other, ...)
- isCompatibleWithBySeqNames() is in environment 'aroma.seq'
at #04. isCompatibleWith.FastaReferenceFile(this, res, mustWork = TRUE,
verbose = less(verbose, 50))
- isCompatibleWith.FastaReferenceFile() is in environment 'aroma.seq'
at #03. isCompatibleWith(this, res, mustWork = TRUE, verbose = less(verbose,
50))
- isCompatibleWith() is in environment 'aroma.core'
at #02. buildBwaIndexSet.FastaReferenceFile(fa)
- buildBwaIndexSet.FastaReferenceFile() is in environment 'aroma.seq'
at #01. buildBwaIndexSet(fa)
- buildBwaIndexSet() is in environment 'aroma.seq'
Error: None of the sequence names matches.
Next step is to have it identify the other BWA index set, iff it exists, or otherwise generate a compatible one.
It may happen that an incorrect pre-built BWA index set is picked up by buildBwaIndexSet(fa), e.g. one carrying a "no_chr" tag. Indeed, the sequence names for the latter have no "chr" prefix.
Thus, buildIndexSet(fa) should assert that the result is compatible with the FASTA file, i.e. compare sequence names and sequence lengths. For this we need to implement a parser for the *.ann file. We can then also output more info when printing the index set, just as for the FASTA file.
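A rough sketch of what such a parser could look like. It rests on an assumption about the *.ann layout that should be verified against the BWA sources: a header line of the form "<total length> <number of sequences> <seed>", followed by two lines per sequence, "<gi> <name> <annotation>" and "<offset> <length> <number of ambiguous bases>". The function name readBwaAnn() is hypothetical and not part of aroma.seq.

```r
# Hypothetical helper: read sequence names and lengths from a BWA *.ann file.
# Assumes the layout described above; not verified against every BWA version.
readBwaAnn <- function(pathname) {
  lines <- readLines(pathname)

  # Header: second field is the number of sequences.
  header <- strsplit(lines[1], " ", fixed = TRUE)[[1]]
  nSeqs <- as.integer(header[2])

  body <- lines[-1]
  stopifnot(length(body) >= 2 * nSeqs)

  # Sequence records alternate between a name line and a length line.
  nameLines <- body[seq(1, by = 2, length.out = nSeqs)]
  lenLines <- body[seq(2, by = 2, length.out = nSeqs)]

  # Second field of the name line is the sequence name.
  names <- vapply(strsplit(nameLines, " ", fixed = TRUE),
                  function(x) x[2], FUN.VALUE = "")
  # Second field of the length line is the sequence length.
  lengths <- vapply(strsplit(lenLines, " ", fixed = TRUE),
                    function(x) as.numeric(x[2]), FUN.VALUE = 0)

  data.frame(name = names, length = lengths, stringsAsFactors = FALSE)
}
```

The resulting names and lengths could then be compared against those of the FASTA file, e.g. via getSeqNames(), before the index set is returned.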