@achevali commented on Wed Feb 15 2017
It should state that the wrong file was given instead of trying to use it.
@samuelklee commented on Wed Feb 22 2017
Hmm, perhaps I should not be using ReferenceUtils.loadFastaDictionary to load the dictionary, or this method needs to check that the file it is loading is indeed a valid .dict file?
@droazen, can you comment? The use case is that we are using a dictionary file to select regions and specify chromosome lengths for plotting. This dictionary file may not necessarily correspond to the reference fasta file used to generate the data being plotted (as it may have chromosomes that the user does not want to plot removed, for example), so we don't want to allow the user to pass the fasta file.
@samuelklee commented on Wed Apr 19 2017
@droazen can you chime in when you get a chance?
@droazen commented on Wed Apr 19 2017
@samuelklee ReferenceUtils.loadFastaDictionary() is fine to use for loading .dict files, even if a companion fasta file is not present. It is surprising that it doesn't throw if it is given a non-.dict input -- from a reading of the code, I suspect that it instead returns an empty sequence dictionary in that case. You could either have the caller check the return value of ReferenceUtils.loadFastaDictionary() to see if the returned dictionary is empty and throw if it is, or you could modify ReferenceUtils.loadFastaDictionary() itself to throw a UserException if the header returned from its SAMTextHeaderCodec contains no sequence dictionary.
@samuelklee commented on Wed Apr 19 2017
I think the responsibility is on the method. Filed https://github.com/broadinstitute/gatk/issues/2609.
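For illustration, the second option @droazen describes (have the loader itself fail when the input contains no sequence dictionary) might look roughly like the plain-Java sketch below. This is not the GATK or htsjdk code. DictCheck and loadSequenceLengths are hypothetical names, the simplified @SQ line parsing stands in for SAMTextHeaderCodec, and IllegalArgumentException stands in for UserException.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the loader; NOT the GATK/htsjdk implementation.
final class DictCheck {

    // Reads SAM-style header lines and returns contig name -> length, in file order.
    // Throws if no @SQ records are present, instead of returning an empty dictionary.
    static Map<String, Integer> loadSequenceLengths(List<String> headerLines) {
        Map<String, Integer> lengths = new LinkedHashMap<>();
        for (String line : headerLines) {
            if (!line.startsWith("@SQ\t")) {
                continue; // only sequence records contribute to the dictionary
            }
            String name = null;
            Integer length = null;
            for (String field : line.split("\t")) {
                if (field.startsWith("SN:")) {
                    name = field.substring(3);
                } else if (field.startsWith("LN:")) {
                    length = Integer.valueOf(field.substring(3));
                }
            }
            if (name == null || length == null) {
                throw new IllegalArgumentException("Malformed @SQ line: " + line);
            }
            lengths.put(name, length);
        }
        if (lengths.isEmpty()) {
            // Assumption: in GATK this would be a UserException naming the offending file.
            throw new IllegalArgumentException(
                "Input contains no sequence dictionary; was a .dict file provided?");
        }
        return lengths;
    }
}
```

With this shape, passing the lines of a fasta file (which contain no @SQ records) fails immediately with a message about the wrong input, rather than handing an empty dictionary to the plotting code.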