matsengrp / vampire

🧛 Deep generative models for TCR sequences 🧛
Apache License 2.0

apply_along_axis is trimming strings to be the same length #82

Closed matsen closed 5 years ago

matsen commented 5 years ago

Well this is freaky:

EGEFCCCHNQNFWVAMNPIANYCCCCMHMH  TCRBV07-07  TCRBJ02-07
AGAVPVKASHFPYFGIALRKCWMTREH     TCRBV23-or  TCRBJ01-05
WDFRHNRYGIEVFWWRWHSHGFPHDFGAMD  TCRBV23-or  TCRBJ01-06
CASSLGGNYGYFF                   TCRBV27-01  TCRBJ01-01
CASSLLGNNEATF                   TCRBV30-01  TCRBJ01-01
CASSLEGSGGYEQTF                 TCRBV02-01  TCRBJ02-01
SGEWPERMWPVPTTQLWGWNSNNHLWLNIY  TCRBVA-or0  TCRBJ02-07

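The orphon gene names (the V genes containing "-or" above) are getting truncated to ten characters, the same length as the ordinary V gene names.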

matsen commented 5 years ago

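This is because I'm using apply_along_axis from numpy, which sizes its output array from the result for the first row, so any longer string returned for a later row gets cut down to that length: https://github.com/numpy/numpy/issues/8352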

This is a known problem.
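
For anyone landing here later, a minimal sketch of the behaviour and one possible workaround. The `join_gene` helper and the gene-name pieces (including "TCRBV23-orphon") are made up for illustration and are not vampire's actual code or real gene names. The sketch assumes a NumPy version that still behaves as described in the linked issue.

```python
import numpy as np


def join_gene(row):
    # Hypothetical helper: build a gene name from the pieces in one row.
    return "-".join(row)


# Made-up rows; the second one produces a longer name than the first.
rows = np.array([["TCRBV07", "07"], ["TCRBV23", "orphon"]])

# apply_along_axis allocates its output using the dtype of the FIRST result
# ("TCRBV07-07", 10 characters), so the longer second result is silently
# truncated when it is written into that fixed-width string array.
truncated = np.apply_along_axis(join_gene, 1, rows)

# Possible workaround: collect the results in a Python list first, so NumPy
# picks a string width large enough for the longest result.
fixed = np.array([join_gene(row) for row in rows])

print(truncated)  # second entry is cut to 10 characters
print(fixed)      # second entry is intact
```

Building the array from a list gives up nothing here, since apply_along_axis is itself a Python-level loop over the rows.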