rarsenal closed this issue 2 years ago
@rarsenal Thanks for bringing this to our attention. I will try to reproduce this. What version of python are you running? Also, did any errors occur when building the reference genome fasta? This issue seems like it could be related to https://github.com/Illumina/GTCtoVCF/issues/64 but that only occurred with a custom fasta file.
Hi jjzieve,
Thanks for the fast reply! Actually I just found the source of the error. I containerized the various tools for the pipeline we are building, so the container had both python2 and python3 environments built in. After I separated the GTCtoVCF component into a standalone container with only the miniconda2 base environment, everything is working as expected. I suppose that GTCtoVCF was inadvertently running on python3 and, while it produced no runtime errors, its bytes/string decoding functions are not compatible with python3? Anyhow, thanks again for your attention, and you can close the issue when you see fit.
Glad you found the issue! In hindsight, I should've known the byte vs. string issue would have a python2 vs. python3 root cause.
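For anyone landing here later, the symptom can be reproduced in isolation. The snippet below is only a minimal sketch of the general python3 behavior, not code taken from GTCtoVCF; the variable `ref_allele` is a hypothetical stand-in for an allele read from a binary source. Under python3, formatting a bytes value into text keeps the `b''` wrapper unless it is decoded first, whereas under python2 the same value is an ordinary string.

```python
# Run under python3. Illustrates how a bytes value can leak into text output.
ref_allele = b"C"  # hypothetical value, as if read from a binary file

# Formatting bytes directly produces its repr, including the b'' wrapper.
broken = "{}".format(ref_allele)

# Decoding first produces the plain text allele.
fixed = ref_allele.decode("ascii")

print(broken)
print(fixed)
```

This matches the `b'C'` seen in the REF column of the reported output, which is consistent with the tool having run under python3.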
Hello, we are trying to apply GTCtoVCF to Illumina iScan data from the Global Diversity Array. We've converted from IDAT to GTC via iaap-cli, but we noticed the VCF output from GTCtoVCF has a few formatting issues that hopefully you could help resolve.
Example line from our output: 1 762320 JHU_1.762319,exm2268640 b'C' T,C . PASS . GT:GQ 2/2:6
Are there environment variables that we should specify to prevent this behavior?
For reference, we used the manifest from https://support.illumina.com/downloads/infinium-global-diversity-array-v1-product-files.html and the references were built using the provided download_reference.sh
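To check whether an existing VCF is affected, something like the following could be used. It is a rough sketch, not part of GTCtoVCF: the helper name `has_bytes_repr` is made up for illustration, and it assumes that fields are whitespace-separated with REF and ALT in the fourth and fifth columns, as in the example line above.

```python
import re

# Matches a python3 bytes repr such as b'C' or b"AT".
BYTES_REPR = re.compile(r"""^b(['"]).*\1$""")


def has_bytes_repr(vcf_line):
    """Return True if the REF or any ALT allele of a VCF data line looks like a bytes repr."""
    if vcf_line.startswith("#"):
        return False  # header and meta lines carry no alleles
    fields = vcf_line.split()
    if len(fields) < 5:
        return False  # not a complete data line
    alleles = [fields[3]] + fields[4].split(",")
    return any(BYTES_REPR.match(allele) for allele in alleles)


example = "1 762320 JHU_1.762319,exm2268640 b'C' T,C . PASS . GT:GQ 2/2:6"
print(has_bytes_repr(example))
```

Any line flagged this way suggests the file was written under python3 and should be regenerated in a python2 environment.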