Illumina / BeadArrayFiles

Python library to parse file formats related to Illumina bead arrays

Error: Exception: GTC file is incomplete #27

Open YHWen-bio opened 3 years ago

YHWen-bio commented 3 years ago

I used the iaap-cli software to convert IDAT files to GTC files. When I parse a GTC file produced by iaap-cli, the BeadArrayFiles package raises the error "Exception: GTC file is incomplete". This looks like the same issue as https://github.com/Illumina/BeadArrayFiles/issues/20. Are the two tools incompatible? Looking forward to your reply!

wuzhaoqi1015 commented 2 years ago

This problem is caused by the server's default encoding, which must be 'en_utf-8' for parsing to work. I had the same problem before, and it went away once I changed the default encoding.

wuzhaoqi1015 commented 2 years ago

The specific location of the problem: BeadArrayUtility.py - read_string() - "result = result.decode("utf-8")"

jjzieve commented 2 years ago

@tengfeixiaozhu This does not appear to be the same issue as https://github.com/Illumina/BeadArrayFiles/issues/20, unless you're referring to the comment from @mikaelamo? I would try @wuzhaoqi1015's solution, and if that works, I'd be happy to review a PR to fix it.

hbhwangg commented 1 year ago

@wuzhaoqi1015 I changed the default encoding to "en_utf-8", but it did not work; I still get "Exception: GTC file is incomplete". Do you know of another solution? @jjzieve Did it work for you?

wuzhaoqi1015 commented 1 year ago

@hbhwangg This is my locale info:

locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8
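To compare environments from inside Python rather than the shell, something like the following could be used. This is only a sketch: describe_encodings is a made-up helper name, not part of BeadArrayFiles, and it reports what the interpreter sees, which may or may not match what iaap-cli used when it wrote the file.

```python
import locale
import sys


def describe_encodings() -> dict:
    """Collect the encodings the running Python interpreter reports."""
    return {
        # Encoding derived from the locale settings (LANG, LC_ALL, ...).
        "preferred": locale.getpreferredencoding(False),
        # Default str/bytes conversion encoding of the interpreter.
        "default": sys.getdefaultencoding(),
        # Encoding used for file names.
        "filesystem": sys.getfilesystemencoding(),
    }


if __name__ == "__main__":
    for name, value in describe_encodings().items():
        print(f"{name}: {value}")
```

Running it on both the working and the failing server gives a quick side-by-side view of the settings.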

wuzhaoqi1015 commented 1 year ago

[image attachment, content not preserved]

wuzhaoqi1015 commented 1 year ago

Another way to solve this is to edit the encoding directly in the code:

BeadArrayUtility.py - read_string() - "result = result.decode("utf-8")"

Change 'utf-8' to the encoding used by your server locale.
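If you would rather not hard-code a single encoding, a fallback along these lines might work. This is a minimal sketch that assumes you already have the raw bytes of one string field; decode_with_fallback and the candidate list are hypothetical and not part of BeadArrayFiles, and the real read_string() may differ.

```python
def decode_with_fallback(raw: bytes, encodings=("utf-8", "latin-1")) -> str:
    """Try each candidate encoding in order and return the first clean decode.

    The candidate list is an assumption; put your server's encoding in it.
    """
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # No candidate decoded cleanly: keep going, but mark the bad bytes.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_with_fallback(b"scan date"))  # plain ASCII decodes as UTF-8
    print(decode_with_fallback(b"\xe9"))       # invalid UTF-8, falls back to latin-1
```

Note that latin-1 accepts every byte value, so placing it last means the function never reaches the final replacement step with the default candidates.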

jjzieve commented 1 year ago

@hbhwangg Did you change the encoding before creating the GTC (in iaap), or only for the parsing done by this script? My hunch is that the scan date or some other field in the GTC is non-ASCII, and BeadArrayFiles fails to read the correct fields because the resulting offsets are wrong. Unfortunately, GTC is not Unicode compatible.
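To illustrate the kind of offset drift I mean, here is a toy example. It uses an invented layout (a one-byte length prefix followed by the text), not the actual GTC format, and pack_fields and unpack_fields are hypothetical helpers. The point is only that a non-ASCII character takes more than one byte in UTF-8, so a length counted in characters no longer matches the number of bytes on disk.

```python
import io


def pack_fields(fields, count_bytes=True) -> bytes:
    """Write each field as a 1-byte length prefix followed by UTF-8 bytes.

    With count_bytes=False the prefix stores the character count instead,
    which is wrong whenever the text contains non-ASCII characters.
    """
    out = bytearray()
    for text in fields:
        raw = text.encode("utf-8")
        out.append(len(raw) if count_bytes else len(text))
        out.extend(raw)
    return bytes(out)


def unpack_fields(blob: bytes, count: int) -> list:
    """Read back `count` length-prefixed fields from the blob."""
    handle = io.BytesIO(blob)
    result = []
    for _ in range(count):
        length = handle.read(1)[0]
        result.append(handle.read(length).decode("utf-8", errors="replace"))
    return result


if __name__ == "__main__":
    fields = ["Sábado", "ok"]  # made-up values; 'á' is 2 bytes in UTF-8
    print(len(fields[0]), len(fields[0].encode("utf-8")))
    print(unpack_fields(pack_fields(fields, count_bytes=True), 2))
    print(unpack_fields(pack_fields(fields, count_bytes=False), 2))
```

With the byte count as the prefix, both fields round-trip. With the character count, the first field is cut one byte short and the reader then treats a leftover text byte as the next length, so every later field is read from the wrong offset.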