Illumina / BeadArrayFiles

Python library to parse file formats related to Illumina bead arrays

Error: Exception: GTC file is incomplete #27

Open tengfeixiaozhu opened 3 years ago

tengfeixiaozhu commented 3 years ago

I used the iaap-cli software to convert IDAT files to GTC files. When I parse a GTC file produced by iaap-cli, the BeadArrayFiles package raises the error Exception: GTC file is incomplete. This looks like the same issue as https://github.com/Illumina/BeadArrayFiles/issues/20. Are the two pieces of software incompatible? Looking forward to your reply!

wuzhaoqi1015 commented 1 year ago

This problem is caused by the server's default encoding. The server's default encoding must be 'en_utf-8' for it to work. I had this problem before too; once I changed the default encoding, it worked.

wuzhaoqi1015 commented 1 year ago

The specific location of the problem: BeadArrayUtility.py - read_string() - "result = result.decode("utf-8")"
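To make the decode step concrete, here is a minimal sketch of what a strict UTF-8 decode does with bytes that were written in another encoding. The helper name `try_decode` and the sample bytes are hypothetical and are not taken from BeadArrayFiles. Note that passing an explicit codec name to `bytes.decode` does not consult the locale, so this only illustrates the case where the bytes themselves are not valid UTF-8.

```python
def try_decode(raw, encoding="utf-8"):
    """Return (text, None) on success or (None, error message) on failure."""
    try:
        return raw.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, str(exc)


# Hypothetical field contents: the same text encoded two different ways.
utf8_bytes = "M\u00fcller".encode("utf-8")      # two bytes for the u-umlaut
latin1_bytes = "M\u00fcller".encode("latin-1")  # one byte (0xFC) for the u-umlaut

ok_text, ok_error = try_decode(utf8_bytes)        # decodes cleanly
bad_text, bad_error = try_decode(latin1_bytes)    # strict UTF-8 decode fails
fixed_text, _ = try_decode(latin1_bytes, "latin-1")  # matching codec succeeds
```

If the failure really is a `UnicodeDecodeError`, the traceback should point at the decode line rather than at the "GTC file is incomplete" check, which is one way to tell the two situations apart.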

jjzieve commented 1 year ago

@tengfeixiaozhu This does not appear to be the same issue as https://github.com/Illumina/BeadArrayFiles/issues/20, unless you're referring to the comment from @mikaelamo? I would try @wuzhaoqi1015's solution, and if that works, I'd be happy to review a PR to fix it.

hbhwangg commented 11 months ago

@wuzhaoqi1015 I changed the default encoding to "en_utf-8", but it did not work and still produces 'Exception: GTC file is incomplete'. Do you know of another solution? @jjzieve Did it work for you?

wuzhaoqi1015 commented 11 months ago

@hbhwangg this is my locale info:

locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8
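For comparison, the encodings that the Python interpreter itself picks up can be checked from inside Python. This is a small sketch using only standard-library calls; the function name `encoding_report` is made up for illustration.

```python
import locale
import sys


def encoding_report():
    """Collect the encodings the running interpreter derives from its environment."""
    return {
        # Encoding implied by the current locale settings (LANG, LC_ALL, ...).
        "preferred": locale.getpreferredencoding(False),
        # Default codec for str.encode / bytes.decode when none is given.
        "default": sys.getdefaultencoding(),
        # Encoding used for file names.
        "filesystem": sys.getfilesystemencoding(),
        # Encoding of standard output, if one is attached.
        "stdout": getattr(sys.stdout, "encoding", None),
    }


report = encoding_report()
for name, value in report.items():
    print(f"{name}: {value}")
```

Running this on the machine that parses the GTC file, and comparing it with the `locale` output, shows whether the shell settings actually reach the Python process.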

wuzhaoqi1015 commented 11 months ago

Another way to solve this is to edit the encoding directly in the code:

BeadArrayUtility.py - read_string() - "result = result.decode("utf-8")"

and change 'utf-8' to the encoding used by your server's locale.
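Rather than hard-coding a single replacement codec, one option is to try several in order. The following is only a sketch of that idea: `decode_with_fallback` and its default codec list are assumptions for illustration, not part of BeadArrayFiles.

```python
def decode_with_fallback(raw, encodings=("utf-8", "latin-1")):
    """Try each codec in order; return (text, codec that worked).

    Assumption: the caller lists candidate codecs from most to least likely.
    latin-1 maps every byte value, so placing it last means the loop always
    succeeds, although the resulting text may not be what the writer intended.
    """
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the bytes")


ascii_result = decode_with_fallback(b"plain ascii")
legacy_result = decode_with_fallback("M\u00fcller".encode("latin-1"))
```

The returned codec name makes it possible to log which fields needed the fallback, which helps confirm whether a non-UTF-8 string is actually present in the file.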

jjzieve commented 11 months ago

@hbhwangg Did you modify the encoding before the GTC was created (in iaap), or only for the parsing done by this script? My hunch is that the scan date or some other field in the GTC is non-ASCII, and BeadArrayFiles is failing to read the correct fields because the offset is wrong. Unfortunately, GTC is not Unicode compatible.
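The offset hunch can be illustrated with a toy record. The layout below (a one-byte length prefix, a string, then a 32-bit integer) is invented for this sketch and is not the real GTC layout; it only shows how a length counted in characters instead of bytes shifts every field that follows a non-ASCII string.

```python
import io
import struct


def write_record(text, value, use_char_count):
    """Write a toy record: 1-byte length, UTF-8 text, little-endian uint32."""
    payload = text.encode("utf-8")
    # The hypothetical bug: storing the character count instead of the byte count.
    length = len(text) if use_char_count else len(payload)
    return struct.pack("<B", length) + payload + struct.pack("<I", value)


def read_record(blob):
    """Read the toy record back, trusting the stored length."""
    handle = io.BytesIO(blob)
    (length,) = struct.unpack("<B", handle.read(1))
    raw = handle.read(length)
    (value,) = struct.unpack("<I", handle.read(4))
    return raw, value


text = "\u00e91"  # two characters, three bytes in UTF-8

good_raw, good_value = read_record(write_record(text, 7, use_char_count=False))
bad_raw, bad_value = read_record(write_record(text, 7, use_char_count=True))
```

With the byte count stored, the integer comes back as 7. With the character count stored, the reader stops one byte early and the integer is assembled from the wrong bytes. For pure ASCII text the two counts are equal, which would explain why most files parse without trouble.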