Illumina / Nirvana

The nimble & robust variant annotator
https://illumina.github.io/NirvanaDocumentation/
GNU General Public License v3.0

VCF contigs with pipe character (1|NT_113878.1) #20

Closed: WestleyASherman closed this issue 6 years ago

WestleyASherman commented 6 years ago

We have some VCF files whose chromosome/contig names include a pipe (vertical bar) character, for example "1|NT_113878.1". This seems to cause an exception when running Nirvana on operating systems that don't allow pipe characters in file or path names:

1|NT_113878.1 ERROR: Illegal characters in path.

Stack trace:
   at System.IO.PathInternal.CheckInvalidPathChars(String path)
   at System.IO.Path.Combine(String path1, String path2)
   at VariantAnnotation.SaReaderUtils.GetReader(String saDir, String ucscReferenceName) in C:...\Nirvana\VariantAnnotation\SaReaderUtils.cs:line 37

Stack trace:
   at System.IO.PathInternal.CheckInvalidPathChars(String path)
   at System.IO.Path.Combine(String path1, String path2)
   at VariantAnnotation.PhyloP.PhylopCommon.GetStream(String directory, String ucscReferenceName) in C:...\Nirvana\VariantAnnotation\PhyloP\PhylopCommon.cs:line 59

As expected, adding a check for the pipe character before building the path is enough to avoid both exceptions:

namespace VariantAnnotation
{
    public static class SaReaderUtils
    {
        private static ISupplementaryAnnotationReader GetReader(string saDir, string ucscReferenceName)
        {
            if (string.IsNullOrEmpty(saDir)) return null;
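            // Workaround: return null for contig names containing '|', which some operating systems reject in file names.
            if (ucscReferenceName.Contains("|")) return null;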
            var saPath = Path.Combine(saDir, ucscReferenceName + ".nsa");
            ...

namespace VariantAnnotation.PhyloP
{
    public static class PhylopCommon
    {
        public static Stream GetStream(string directory, string ucscReferenceName)
        {
            if (string.IsNullOrEmpty(directory)) return null;
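            // Workaround: return null for contig names containing '|', which some operating systems reject in file names.
            if (ucscReferenceName.Contains("|")) return null;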
            var phylopPath = Path.Combine(directory, ucscReferenceName + ".npd");
            ...

But there are almost certainly fixes that are better: cleaner, more thorough, etc. :)
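One possibly more thorough direction, sketched here only as an illustration: instead of testing for the pipe character alone, ask the runtime which characters the current operating system rejects in file names and test the contig name against all of them. The class and method names below (ReferenceNameChecks, IsUsableAsFileName) are hypothetical and not part of Nirvana. Path.GetInvalidFileNameChars and String.IndexOfAny are standard .NET calls.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of Nirvana: a sketch of a broader check
// than testing for '|' alone.
public static class ReferenceNameChecks
{
    // Characters the current operating system rejects in a file name.
    // The set differs by platform, so it is queried rather than hard-coded.
    private static readonly char[] InvalidChars = Path.GetInvalidFileNameChars();

    // Returns true when the contig name is non-empty and contains none of
    // the platform's invalid file-name characters.
    public static bool IsUsableAsFileName(string referenceName)
    {
        return !string.IsNullOrEmpty(referenceName)
            && referenceName.IndexOfAny(InvalidChars) < 0;
    }
}

public static class Demo
{
    public static void Main()
    {
        // The second result depends on the operating system's invalid set.
        Console.WriteLine(ReferenceNameChecks.IsUsableAsFileName("chr1"));
        Console.WriteLine(ReferenceNameChecks.IsUsableAsFileName("1|NT_113878.1"));
    }
}
```

Under this assumption, GetReader and GetStream could return null whenever IsUsableAsFileName(ucscReferenceName) is false, which would replace the two Contains("|") checks. The pipe case is then handled on systems where it is invalid, and so is any other character that would make Path.Combine throw there.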

rajatshuvro commented 6 years ago

Thanks for pointing it out. We will add this issue to our backlog.