michelleapaz / T3E

GNU General Public License v3.0

t3e.py traceback error #3

Closed by RomeroMatt 9 months ago

RomeroMatt commented 10 months ago

Hello, thanks so much for the tool! I'm very excited to use it. I recently got everything downloaded and running on my Apple MacBook Pro, but I have run into a problem when running the t3e.py script. The error I get is:

Traceback (most recent call last):
  File "/Users/matthewa/Data/scripts/T3E/scripts/t3e.py", line 256, in
    (te_chr, te_start, te_end, repeat) = line.split("\t")
ValueError: too many values to unpack (expected 4)

I am not sure if this is a Mac problem (I'm using zsh) or some other user error on my end. I did notice that when running the main.sh script, I had to replace the "grep" commands with "ggrep"; it seems this is because of zsh/Mac.

Everything runs perfectly up to that point in the t3e.py script. I checked the .bed files, and they all look like the example posted in the README section.

I'd really appreciate any suggestions as to what I'm doing wrong or ways to mitigate this error. Thanks! -Matt

michelleapaz commented 10 months ago

Hi Matt,

Thank you for your message!

This error occurs when the number of variables on the left-hand side does not match the number of values being unpacked. Here, line.split("\t") returned more than four fields, so Python cannot tell which value to assign to which of the four variables.

The repeat annotation should have exactly four tab-separated columns: 1 - chromosome, 2 - start coordinate, 3 - end coordinate, 4 - TE name
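To illustrate the mismatch, here is a minimal sketch that reproduces the error with a six-column line and shows one tolerant way to read it. The sample values and the helper name parse_te_line are hypothetical and are not part of t3e.py; only the four variable names and the split on "\t" come from the traceback.

```python
# Illustrative six-column line (made-up coordinates): four expected fields plus two extras.
line = "chr1\t100\t200\tAluY\t2582\t-\n"

fields = line.rstrip("\n").split("\t")

error_message = None
try:
    # Same unpacking pattern as in the traceback: four names, six values.
    (te_chr, te_start, te_end, repeat) = fields
except ValueError as err:
    # The message starts with "too many values to unpack".
    error_message = str(err)


def parse_te_line(raw_line):
    """Hypothetical helper: keep only the first four fields of a repeat annotation line."""
    parts = raw_line.rstrip("\n").split("\t")
    if len(parts) < 4:
        raise ValueError(f"expected at least 4 columns, got {len(parts)}")
    te_chr, te_start, te_end, repeat = parts[:4]
    return te_chr, int(te_start), int(te_end), repeat


print(error_message)
print(parse_te_line(line))
```

Slicing with parts[:4] ignores any extra columns, whereas the strict four-name unpacking fails as soon as a line has a fifth field.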

It could be a problem with the format of the TE annotation you are using (e.g., an extra tab). I need some extra information from your side to reproduce the error using macOS:

Thanks!

Michelle

RomeroMatt commented 9 months ago

Thanks for the quick response, Michelle! I didn't even think about the repeat annotation file! I downloaded the repeats using UCSC's Table Browser (http://genome.ucsc.edu/cgi-bin/hgTables) by selecting "Repeats" as the group and "RepeatMasker" as the track, and downloaded that as a BED file.

After checking the file, I do have more than 4 columns:

chr1 67108753 67109046 L1P5 1892 +
chr1 8388315 8388618 AluY 2582 -
chr1 25165803 25166380 L1MB5 4085 +
chr1 33554185 33554483 AluSc 2285 -
chr1 41942894 41943205 AluY 2451 -
chr1 50331336 50332274 HAL1 1587 +

I removed the last two columns and reran the script and it seems to be running now.
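For anyone hitting the same error, the trimming step could be sketched roughly as follows. This is not part of T3E; the function names and file names are hypothetical, and it assumes a tab-separated BED file whose first four columns are chromosome, start, end, and TE name.

```python
def trim_bed_line(line, n_columns=4):
    """Return the first n_columns tab-separated fields of one BED line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < n_columns:
        raise ValueError(f"expected at least {n_columns} columns, got {len(fields)}")
    return "\t".join(fields[:n_columns])


def trim_bed_file(src_path, dst_path, n_columns=4):
    """Copy src_path to dst_path, keeping only the first n_columns columns."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            # Skip blank lines and header lines ("track", "browser", or "#").
            if not line.strip() or line.startswith(("track", "browser", "#")):
                continue
            dst.write(trim_bed_line(line, n_columns) + "\n")
```

With hypothetical file names, trim_bed_file("rmsk.bed", "rmsk.4col.bed") would write a four-column copy that can then be passed to the pipeline in place of the original download.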

Thanks, Michelle! I'll keep you updated on the progress. -Matt