oxpig / ANARCI

Antibody Numbering and Antigen Receptor ClassIfication
BSD 3-Clause "New" or "Revised" License

Command line and python library give different result #86

Closed immunomotivees closed 2 months ago

immunomotivees commented 3 months ago

Hello, and thank you for developing and maintaining such a valuable tool for the research community.

I'm a regular user of the ANARCI tool, primarily using it as a library in my projects like so:

from anarci import anarci
results = anarci([(id, sequence)], scheme='imgt', output=False)

I usually don't have any trouble with it, but for some sequences I found mistakes in the numbering.

For example, when I look at the numbering of one of my sequences, and more precisely at its CDR3 (positions 105 to 117), I find a CYS at position 105, which seems off: ((105, ' '), 'C')

However, running the same sequence through the command-line version gives me a different (and, by my checks, correct) numbering:

ANARCI -i SEQUENCE

output:

L 104 C
L 105 M

Is this a known issue? Does anyone have any clue about this difference? How do you correctly manipulate the numbering in a Python script?

Thanks for any insight you can provide.

Best regards, Celia

ALGW71 commented 3 months ago

Hi Celia,

Thanks for raising this. We have not come across it before, so we need to investigate.

Would you mind sharing the sequence in which this occurred so we can recreate the error?

If you don't want to make it public, you can email it to the following address: stat0427@nexus.ox.ac.uk

Many thanks, Alex

immunomotivees commented 3 months ago

Hello Alex, I tried with publicly available sequences, and here are my results:

[{'id': '1ae6_L|Light|L_VL', 'whole_sequence': 'DIVMTQAAPSVPVTPGESLSISCRSSKSLLHSNGDTFLYWFLQRPGQSPQLLIYRMSNLASGVPDRFSGSGSGTAFTLRVSRVEAEDVGVYYCMQHLEYPFTFGAGTKLELK', 'whole_sequence_padded': 'DIVMTQAAPSVPVTPGESLS-ISCRSSKSLLH-SNGDTFLYWFLQRPGQSPQLLIYR-------MSNLASGVP-DRFSGSG--SGTAFTLRVSRVEAEDVGVYYCMQHL---EYPFTFGAGTKLELK', 'whole_sequence_delimited': 'DIVMTQAAPSVPVTPGESLSISCRS|SKSLLHSNGDT|FLYWFLQRPGQSPQLLI|YRM|SNLASGVPDRFSGSGSGTAFTLRVSRVEAEDVGVYY|CMQHLEYPFT|FGAGTKLELK', 'chain_type': 'K', 'numbering': [((1, ' '), 'D'), ((2, ' '), 'I'), ((3, ' '), 'V'), ((4, ' '), 'M'), ((5, ' '), 'T'), ((6, ' '), 'Q'), ((7, ' '), 'A'), ((8, ' '), 'A'), ((9, ' '), 'P'), ((10, ' '), 'S'), ((11, ' '), 'V'), ((12, ' '), 'P'), ((13, ' '), 'V'), ((14, ' '), 'T'), ((15, ' '), 'P'), ((16, ' '), 'G'), ((17, ' '), 'E'), ((18, ' '), 'S'), ((19, ' '), 'L'), ((20, ' '), 'S'), ((21, ' '), '-'), ((22, ' '), 'I'), ((23, ' '), 'S'), ((24, ' '), 'C'), ((25, ' '), 'R'), ((26, ' '), 'S'), ((27, ' '), 'S'), ((28, ' '), 'K'), ((29, ' '), 'S'), ((30, ' '), 'L'), ((31, ' '), 'L'), ((32, ' '), 'H'), ((33, ' '), '-'), ((34, ' '), 'S'), ((35, ' '), 'N'), ((36, ' '), 'G'), ((37, ' '), 'D'), ((38, ' '), 'T'), ((39, ' '), 'F'), ((40, ' '), 'L'), ((41, ' '), 'Y'), ((42, ' '), 'W'), ((43, ' '), 'F'), ((44, ' '), 'L'), ((45, ' '), 'Q'), ((46, ' '), 'R'), ((47, ' '), 'P'), ((48, ' '), 'G'), ((49, ' '), 'Q'), ((50, ' '), 'S'), ((51, ' '), 'P'), ((52, ' '), 'Q'), ((53, ' '), 'L'), ((54, ' '), 'L'), ((55, ' '), 'I'), ((56, ' '), 'Y'), ((57, ' '), 'R'), ((58, ' '), '-'), ((59, ' '), '-'), ((60, ' '), '-'), ((61, ' '), '-'), ((62, ' '), '-'), ((63, ' '), '-'), ((64, ' '), '-'), ((65, ' '), 'M'), ((66, ' '), 'S'), ((67, ' '), 'N'), ((68, ' '), 'L'), ((69, ' '), 'A'), ((70, ' '), 'S'), ((71, ' '), 'G'), ((72, ' '), 'V'), ((73, ' '), 'P'), ((74, ' '), '-'), ((75, ' '), 'D'), ((76, ' '), 'R'), ((77, ' '), 'F'), ((78, ' '), 'S'), ((79, ' '), 'G'), ((80, ' '), 'S'), ((81, ' '), 'G'), ((82, ' '), '-'), ((83, ' '), '-'), ((84, ' '), 
'S'), ((85, ' '), 'G'), ((86, ' '), 'T'), ((87, ' '), 'A'), ((88, ' '), 'F'), ((89, ' '), 'T'), ((90, ' '), 'L'), ((91, ' '), 'R'), ((92, ' '), 'V'), ((93, ' '), 'S'), ((94, ' '), 'R'), ((95, ' '), 'V'), ((96, ' '), 'E'), ((97, ' '), 'A'), ((98, ' '), 'E'), ((99, ' '), 'D'), ((100, ' '), 'V'), ((101, ' '), 'G'), ((102, ' '), 'V'), ((103, ' '), 'Y'), ((104, ' '), 'Y'), ((105, ' '), 'C'), ((106, ' '), 'M'), ((107, ' '), 'Q'), ((108, ' '), 'H'), ((109, ' '), 'L'), ((110, ' '), '-'), ((111, ' '), '-'), ((112, ' '), '-'), ((113, ' '), 'E'), ((114, ' '), 'Y'), ((115, ' '), 'P'), ((116, ' '), 'F'), ((117, ' '), 'T'), ((118, ' '), 'F'), ((119, ' '), 'G'), ((120, ' '), 'A'), ((121, ' '), 'G'), ((122, ' '), 'T'), ((123, ' '), 'K'), ((124, ' '), 'L'), ((125, ' '), 'E'), ((126, ' '), 'L'), ((127, ' '), 'K')], 'FR1': 'DIVMTQAAPSVPVTPGESLSISCRS', 'FR1_taille': 24, 'FR1_position': (1, 25), 'CDR1': 'SKSLLHSNGDT', 'CDR1_taille': 10, 'CDR1_position': (26, 36), 'FR2': 'FLYWFLQRPGQSPQLLI', 'FR2_taille': 16, 'FR2_position': (37, 53), 'CDR2': 'YRM', 'CDR2_taille': 2, 'CDR2_position': (54, 56), 'FR3': 'SNLASGVPDRFSGSGSGTAFTLRVSRVEAEDVGVYY', 'FR3_taille': 35, 'FR3_position': (57, 92), 'CDR3': 'CMQHLEYPFT', 'CDR3_taille': 9, 'CDR3_position': (93, 102), 'FR4': 'FGAGTKLELK', 'FR4_taille': 9, 'FR4_position': (103, 112)}] [{'id': '1ad9_L|Light|L_VL', 'whole_sequence': 'DIQMTQSPSTLSASVGDRVTITCRSSKSLLHSNGDTFLYWFQQKPGKAPKLLMYRMSNLASGVPSRFSGSGSGTEFTLTISSLQPDDFATYYCMQHLEYPFTFGQGTKVEVK', 'whole_sequence_padded': 'DIQMTQSPSTLSASVGDRVT-ITCRSSKSLLH-SNGDTFLYWFQQKPGKAPKLLMYR-------MSNLASGVP-SRFSGSG--SGTEFTLTISSLQPDDFATYYCMQHL---EYPFTFGQGTKVEVK', 'whole_sequence_delimited': 'DIQMTQSPSTLSASVGDRVTITCRS|SKSLLHSNGDT|FLYWFQQKPGKAPKLLM|YRM|SNLASGVPSRFSGSGSGTEFTLTISSLQPDDFATYY|CMQHLEYPFT|FGQGTKVEVK', 'chain_type': 'K', 'numbering': [((1, ' '), 'D'), ((2, ' '), 'I'), ((3, ' '), 'Q'), ((4, ' '), 'M'), ((5, ' '), 'T'), ((6, ' '), 'Q'), ((7, ' '), 'S'), ((8, ' '), 'P'), ((9, ' '), 'S'), ((10, ' '), 'T'), ((11, ' '), 
'L'), ((12, ' '), 'S'), ((13, ' '), 'A'), ((14, ' '), 'S'), ((15, ' '), 'V'), ((16, ' '), 'G'), ((17, ' '), 'D'), ((18, ' '), 'R'), ((19, ' '), 'V'), ((20, ' '), 'T'), ((21, ' '), '-'), ((22, ' '), 'I'), ((23, ' '), 'T'), ((24, ' '), 'C'), ((25, ' '), 'R'), ((26, ' '), 'S'), ((27, ' '), 'S'), ((28, ' '), 'K'), ((29, ' '), 'S'), ((30, ' '), 'L'), ((31, ' '), 'L'), ((32, ' '), 'H'), ((33, ' '), '-'), ((34, ' '), 'S'), ((35, ' '), 'N'), ((36, ' '), 'G'), ((37, ' '), 'D'), ((38, ' '), 'T'), ((39, ' '), 'F'), ((40, ' '), 'L'), ((41, ' '), 'Y'), ((42, ' '), 'W'), ((43, ' '), 'F'), ((44, ' '), 'Q'), ((45, ' '), 'Q'), ((46, ' '), 'K'), ((47, ' '), 'P'), ((48, ' '), 'G'), ((49, ' '), 'K'), ((50, ' '), 'A'), ((51, ' '), 'P'), ((52, ' '), 'K'), ((53, ' '), 'L'), ((54, ' '), 'L'), ((55, ' '), 'M'), ((56, ' '), 'Y'), ((57, ' '), 'R'), ((58, ' '), '-'), ((59, ' '), '-'), ((60, ' '), '-'), ((61, ' '), '-'), ((62, ' '), '-'), ((63, ' '), '-'), ((64, ' '), '-'), ((65, ' '), 'M'), ((66, ' '), 'S'), ((67, ' '), 'N'), ((68, ' '), 'L'), ((69, ' '), 'A'), ((70, ' '), 'S'), ((71, ' '), 'G'), ((72, ' '), 'V'), ((73, ' '), 'P'), ((74, ' '), '-'), ((75, ' '), 'S'), ((76, ' '), 'R'), ((77, ' '), 'F'), ((78, ' '), 'S'), ((79, ' '), 'G'), ((80, ' '), 'S'), ((81, ' '), 'G'), ((82, ' '), '-'), ((83, ' '), '-'), ((84, ' '), 'S'), ((85, ' '), 'G'), ((86, ' '), 'T'), ((87, ' '), 'E'), ((88, ' '), 'F'), ((89, ' '), 'T'), ((90, ' '), 'L'), ((91, ' '), 'T'), ((92, ' '), 'I'), ((93, ' '), 'S'), ((94, ' '), 'S'), ((95, ' '), 'L'), ((96, ' '), 'Q'), ((97, ' '), 'P'), ((98, ' '), 'D'), ((99, ' '), 'D'), ((100, ' '), 'F'), ((101, ' '), 'A'), ((102, ' '), 'T'), ((103, ' '), 'Y'), ((104, ' '), 'Y'), ((105, ' '), 'C'), ((106, ' '), 'M'), ((107, ' '), 'Q'), ((108, ' '), 'H'), ((109, ' '), 'L'), ((110, ' '), '-'), ((111, ' '), '-'), ((112, ' '), '-'), ((113, ' '), 'E'), ((114, ' '), 'Y'), ((115, ' '), 'P'), ((116, ' '), 'F'), ((117, ' '), 'T'), ((118, ' '), 'F'), ((119, ' '), 'G'), ((120, ' '), 'Q'), ((121, ' 
'), 'G'), ((122, ' '), 'T'), ((123, ' '), 'K'), ((124, ' '), 'V'), ((125, ' '), 'E'), ((126, ' '), 'V'), ((127, ' '), 'K')], 'FR1': 'DIQMTQSPSTLSASVGDRVTITCRS', 'FR1_taille': 24, 'FR1_position': (1, 25), 'CDR1': 'SKSLLHSNGDT', 'CDR1_taille': 10, 'CDR1_position': (26, 36), 'FR2': 'FLYWFQQKPGKAPKLLM', 'FR2_taille': 16, 'FR2_position': (37, 53), 'CDR2': 'YRM', 'CDR2_taille': 2, 'CDR2_position': (54, 56), 'FR3': 'SNLASGVPSRFSGSGSGTEFTLTISSLQPDDFATYY', 'FR3_taille': 35, 'FR3_position': (57, 92), 'CDR3': 'CMQHLEYPFT', 'CDR3_taille': 9, 'CDR3_position': (93, 102), 'FR4': 'FGQGTKVEVK', 'FR4_taille': 9, 'FR4_position': (103, 112)}] [{'id': '5v7u_L|Light|L_VL', 'whole_sequence': 'DVVMTQSPLSLPVTPGEPASISCRSSQSLLHRSGHKYLHWYLQRPGQSPQVLIYLGSNRASGVPDRFSGSGSGTDFTLKISRVEAEDVGLYYCMQTLQTPWTFGQGTKVEIK', 'whole_sequence_padded': 'DVVMTQSPLSLPVTPGEPAS-ISCRSSQSLLH-RSGHKYLHWYLQRPGQSPQVLIYL-------GSNRASGVP-DRFSGSG--SGTDFTLKISRVEAEDVGLYYCMQTL---QTPWTFGQGTKVEIK', 'whole_sequence_delimited': 'DVVMTQSPLSLPVTPGEPASISCRS|SQSLLHRSGHK|YLHWYLQRPGQSPQVLI|YLG|SNRASGVPDRFSGSGSGTDFTLKISRVEAEDVGLYY|CMQTLQTPWT|FGQGTKVEIK', 'chain_type': 'K', 'numbering': [((1, ' '), 'D'), ((2, ' '), 'V'), ((3, ' '), 'V'), ((4, ' '), 'M'), ((5, ' '), 'T'), ((6, ' '), 'Q'), ((7, ' '), 'S'), ((8, ' '), 'P'), ((9, ' '), 'L'), ((10, ' '), 'S'), ((11, ' '), 'L'), ((12, ' '), 'P'), ((13, ' '), 'V'), ((14, ' '), 'T'), ((15, ' '), 'P'), ((16, ' '), 'G'), ((17, ' '), 'E'), ((18, ' '), 'P'), ((19, ' '), 'A'), ((20, ' '), 'S'), ((21, ' '), '-'), ((22, ' '), 'I'), ((23, ' '), 'S'), ((24, ' '), 'C'), ((25, ' '), 'R'), ((26, ' '), 'S'), ((27, ' '), 'S'), ((28, ' '), 'Q'), ((29, ' '), 'S'), ((30, ' '), 'L'), ((31, ' '), 'L'), ((32, ' '), 'H'), ((33, ' '), '-'), ((34, ' '), 'R'), ((35, ' '), 'S'), ((36, ' '), 'G'), ((37, ' '), 'H'), ((38, ' '), 'K'), ((39, ' '), 'Y'), ((40, ' '), 'L'), ((41, ' '), 'H'), ((42, ' '), 'W'), ((43, ' '), 'Y'), ((44, ' '), 'L'), ((45, ' '), 'Q'), ((46, ' '), 'R'), ((47, ' '), 'P'), ((48, ' '), 'G'), ((49, ' '), 
'Q'), ((50, ' '), 'S'), ((51, ' '), 'P'), ((52, ' '), 'Q'), ((53, ' '), 'V'), ((54, ' '), 'L'), ((55, ' '), 'I'), ((56, ' '), 'Y'), ((57, ' '), 'L'), ((58, ' '), '-'), ((59, ' '), '-'), ((60, ' '), '-'), ((61, ' '), '-'), ((62, ' '), '-'), ((63, ' '), '-'), ((64, ' '), '-'), ((65, ' '), 'G'), ((66, ' '), 'S'), ((67, ' '), 'N'), ((68, ' '), 'R'), ((69, ' '), 'A'), ((70, ' '), 'S'), ((71, ' '), 'G'), ((72, ' '), 'V'), ((73, ' '), 'P'), ((74, ' '), '-'), ((75, ' '), 'D'), ((76, ' '), 'R'), ((77, ' '), 'F'), ((78, ' '), 'S'), ((79, ' '), 'G'), ((80, ' '), 'S'), ((81, ' '), 'G'), ((82, ' '), '-'), ((83, ' '), '-'), ((84, ' '), 'S'), ((85, ' '), 'G'), ((86, ' '), 'T'), ((87, ' '), 'D'), ((88, ' '), 'F'), ((89, ' '), 'T'), ((90, ' '), 'L'), ((91, ' '), 'K'), ((92, ' '), 'I'), ((93, ' '), 'S'), ((94, ' '), 'R'), ((95, ' '), 'V'), ((96, ' '), 'E'), ((97, ' '), 'A'), ((98, ' '), 'E'), ((99, ' '), 'D'), ((100, ' '), 'V'), ((101, ' '), 'G'), ((102, ' '), 'L'), ((103, ' '), 'Y'), ((104, ' '), 'Y'), ((105, ' '), 'C'), ((106, ' '), 'M'), ((107, ' '), 'Q'), ((108, ' '), 'T'), ((109, ' '), 'L'), ((110, ' '), '-'), ((111, ' '), '-'), ((112, ' '), '-'), ((113, ' '), 'Q'), ((114, ' '), 'T'), ((115, ' '), 'P'), ((116, ' '), 'W'), ((117, ' '), 'T'), ((118, ' '), 'F'), ((119, ' '), 'G'), ((120, ' '), 'Q'), ((121, ' '), 'G'), ((122, ' '), 'T'), ((123, ' '), 'K'), ((124, ' '), 'V'), ((125, ' '), 'E'), ((126, ' '), 'I'), ((127, ' '), 'K')], 'FR1': 'DVVMTQSPLSLPVTPGEPASISCRS', 'FR1_taille': 24, 'FR1_position': (1, 25), 'CDR1': 'SQSLLHRSGHK', 'CDR1_taille': 10, 'CDR1_position': (26, 36), 'FR2': 'YLHWYLQRPGQSPQVLI', 'FR2_taille': 16, 'FR2_position': (37, 53), 'CDR2': 'YLG', 'CDR2_taille': 2, 'CDR2_position': (54, 56), 'FR3': 'SNRASGVPDRFSGSGSGTDFTLKISRVEAEDVGLYY', 'FR3_taille': 35, 'FR3_position': (57, 92), 'CDR3': 'CMQTLQTPWT', 'CDR3_taille': 9, 'CDR3_position': (93, 102), 'FR4': 'FGQGTKVEIK', 'FR4_taille': 9, 'FR4_position': (103, 112)}] [{'id': '5uxq_L|Light|L_VL', 'whole_sequence': 
'TVVTQSPLSLPVTPGEAASMSCTSTQSLRHSNGANYLAWYQHKPGQSPRLLIRLGSQRASGVPDRFSGSGSGTHFTLKISRVEPEDAAIYYCMQGLNRPWTFGKGTKLEIK', 'whole_sequence_padded': '-TVVTQSPLSLPVTPGEAAS-MSCTSTQSLRH-SNGANYLAWYQHKPGQSPRLLIRL-------GSQRASGVP-DRFSGSG--SGTHFTLKISRVEPEDAAIYYCMQGL---NRPWTFGKGTKLEIK', 'whole_sequence_delimited': 'TVVTQSPLSLPVTPGEAASMSCTS|TQSLRHSNGAN|YLAWYQHKPGQSPRLLI|RLG|SQRASGVPDRFSGSGSGTHFTLKISRVEPEDAAIYY|CMQGLNRPWT|FGKGTKLEIK', 'chain_type': 'K', 'numbering': [((1, ' '), '-'), ((2, ' '), 'T'), ((3, ' '), 'V'), ((4, ' '), 'V'), ((5, ' '), 'T'), ((6, ' '), 'Q'), ((7, ' '), 'S'), ((8, ' '), 'P'), ((9, ' '), 'L'), ((10, ' '), 'S'), ((11, ' '), 'L'), ((12, ' '), 'P'), ((13, ' '), 'V'), ((14, ' '), 'T'), ((15, ' '), 'P'), ((16, ' '), 'G'), ((17, ' '), 'E'), ((18, ' '), 'A'), ((19, ' '), 'A'), ((20, ' '), 'S'), ((21, ' '), '-'), ((22, ' '), 'M'), ((23, ' '), 'S'), ((24, ' '), 'C'), ((25, ' '), 'T'), ((26, ' '), 'S'), ((27, ' '), 'T'), ((28, ' '), 'Q'), ((29, ' '), 'S'), ((30, ' '), 'L'), ((31, ' '), 'R'), ((32, ' '), 'H'), ((33, ' '), '-'), ((34, ' '), 'S'), ((35, ' '), 'N'), ((36, ' '), 'G'), ((37, ' '), 'A'), ((38, ' '), 'N'), ((39, ' '), 'Y'), ((40, ' '), 'L'), ((41, ' '), 'A'), ((42, ' '), 'W'), ((43, ' '), 'Y'), ((44, ' '), 'Q'), ((45, ' '), 'H'), ((46, ' '), 'K'), ((47, ' '), 'P'), ((48, ' '), 'G'), ((49, ' '), 'Q'), ((50, ' '), 'S'), ((51, ' '), 'P'), ((52, ' '), 'R'), ((53, ' '), 'L'), ((54, ' '), 'L'), ((55, ' '), 'I'), ((56, ' '), 'R'), ((57, ' '), 'L'), ((58, ' '), '-'), ((59, ' '), '-'), ((60, ' '), '-'), ((61, ' '), '-'), ((62, ' '), '-'), ((63, ' '), '-'), ((64, ' '), '-'), ((65, ' '), 'G'), ((66, ' '), 'S'), ((67, ' '), 'Q'), ((68, ' '), 'R'), ((69, ' '), 'A'), ((70, ' '), 'S'), ((71, ' '), 'G'), ((72, ' '), 'V'), ((73, ' '), 'P'), ((74, ' '), '-'), ((75, ' '), 'D'), ((76, ' '), 'R'), ((77, ' '), 'F'), ((78, ' '), 'S'), ((79, ' '), 'G'), ((80, ' '), 'S'), ((81, ' '), 'G'), ((82, ' '), '-'), ((83, ' '), '-'), ((84, ' '), 'S'), ((85, ' '), 'G'), ((86, ' '), 'T'), ((87, ' 
'), 'H'), ((88, ' '), 'F'), ((89, ' '), 'T'), ((90, ' '), 'L'), ((91, ' '), 'K'), ((92, ' '), 'I'), ((93, ' '), 'S'), ((94, ' '), 'R'), ((95, ' '), 'V'), ((96, ' '), 'E'), ((97, ' '), 'P'), ((98, ' '), 'E'), ((99, ' '), 'D'), ((100, ' '), 'A'), ((101, ' '), 'A'), ((102, ' '), 'I'), ((103, ' '), 'Y'), ((104, ' '), 'Y'), ((105, ' '), 'C'), ((106, ' '), 'M'), ((107, ' '), 'Q'), ((108, ' '), 'G'), ((109, ' '), 'L'), ((110, ' '), '-'), ((111, ' '), '-'), ((112, ' '), '-'), ((113, ' '), 'N'), ((114, ' '), 'R'), ((115, ' '), 'P'), ((116, ' '), 'W'), ((117, ' '), 'T'), ((118, ' '), 'F'), ((119, ' '), 'G'), ((120, ' '), 'K'), ((121, ' '), 'G'), ((122, ' '), 'T'), ((123, ' '), 'K'), ((124, ' '), 'L'), ((125, ' '), 'E'), ((126, ' '), 'I'), ((127, ' '), 'K')], 'FR1': 'TVVTQSPLSLPVTPGEAASMSCTS', 'FR1_taille': 23, 'FR1_position': (1, 24), 'CDR1': 'TQSLRHSNGAN', 'CDR1_taille': 10, 'CDR1_position': (25, 35), 'FR2': 'YLAWYQHKPGQSPRLLI', 'FR2_taille': 16, 'FR2_position': (36, 52), 'CDR2': 'RLG', 'CDR2_taille': 2, 'CDR2_position': (53, 55), 'FR3': 'SQRASGVPDRFSGSGSGTHFTLKISRVEPEDAAIYY', 'FR3_taille': 35, 'FR3_position': (56, 91), 'CDR3': 'CMQGLNRPWT', 'CDR3_taille': 9, 'CDR3_position': (92, 101), 'FR4': 'FGKGTKLEIK', 'FR4_taille': 9, 'FR4_position': (102, 111)}] [{'id': '5u15_B|Light|B_VL', 'whole_sequence': 'QSALTQPASVSGSPGQSITISCTGTSSDVGSYNLVSWYQQHPGKAPKLMIYEVSKRPSGVSNRFSGSKSGNTASLTISGLQAEDEADYYCCSYAGSSTVIFGGGTKLTVL', 'whole_sequence_padded': 'QSALTQPAS-VSGSPGQSITISCTGTSSDVG---SYNLVSWYQQHPGKAPKLMIYEV-------SKRPSGVS-NRFSGSK--SGNTASLTISGLQAEDEADYYCCSYAG---SSTVIFGGGTKLTVL', 'whole_sequence_delimited': 'QSALTQPASVSGSPGQSITISCTGT|SSDVGSYNL|VSWYQQHPGKAPKLMIY|EVS|KRPSGVSNRFSGSKSGNTASLTISGLQAEDEADYYC|CSYAGSSTVI|FGGGTKLTVL', 'chain_type': 'L', 'numbering': [((1, ' '), 'Q'), ((2, ' '), 'S'), ((3, ' '), 'A'), ((4, ' '), 'L'), ((5, ' '), 'T'), ((6, ' '), 'Q'), ((7, ' '), 'P'), ((8, ' '), 'A'), ((9, ' '), 'S'), ((10, ' '), '-'), ((11, ' '), 'V'), ((12, ' '), 'S'), ((13, ' '), 'G'), ((14, ' '), 
'S'), ((15, ' '), 'P'), ((16, ' '), 'G'), ((17, ' '), 'Q'), ((18, ' '), 'S'), ((19, ' '), 'I'), ((20, ' '), 'T'), ((21, ' '), 'I'), ((22, ' '), 'S'), ((23, ' '), 'C'), ((24, ' '), 'T'), ((25, ' '), 'G'), ((26, ' '), 'T'), ((27, ' '), 'S'), ((28, ' '), 'S'), ((29, ' '), 'D'), ((30, ' '), 'V'), ((31, ' '), 'G'), ((32, ' '), '-'), ((33, ' '), '-'), ((34, ' '), '-'), ((35, ' '), 'S'), ((36, ' '), 'Y'), ((37, ' '), 'N'), ((38, ' '), 'L'), ((39, ' '), 'V'), ((40, ' '), 'S'), ((41, ' '), 'W'), ((42, ' '), 'Y'), ((43, ' '), 'Q'), ((44, ' '), 'Q'), ((45, ' '), 'H'), ((46, ' '), 'P'), ((47, ' '), 'G'), ((48, ' '), 'K'), ((49, ' '), 'A'), ((50, ' '), 'P'), ((51, ' '), 'K'), ((52, ' '), 'L'), ((53, ' '), 'M'), ((54, ' '), 'I'), ((55, ' '), 'Y'), ((56, ' '), 'E'), ((57, ' '), 'V'), ((58, ' '), '-'), ((59, ' '), '-'), ((60, ' '), '-'), ((61, ' '), '-'), ((62, ' '), '-'), ((63, ' '), '-'), ((64, ' '), '-'), ((65, ' '), 'S'), ((66, ' '), 'K'), ((67, ' '), 'R'), ((68, ' '), 'P'), ((69, ' '), 'S'), ((70, ' '), 'G'), ((71, ' '), 'V'), ((72, ' '), 'S'), ((73, ' '), '-'), ((74, ' '), 'N'), ((75, ' '), 'R'), ((76, ' '), 'F'), ((77, ' '), 'S'), ((78, ' '), 'G'), ((79, ' '), 'S'), ((80, ' '), 'K'), ((81, ' '), '-'), ((82, ' '), '-'), ((83, ' '), 'S'), ((84, ' '), 'G'), ((85, ' '), 'N'), ((86, ' '), 'T'), ((87, ' '), 'A'), ((88, ' '), 'S'), ((89, ' '), 'L'), ((90, ' '), 'T'), ((91, ' '), 'I'), ((92, ' '), 'S'), ((93, ' '), 'G'), ((94, ' '), 'L'), ((95, ' '), 'Q'), ((96, ' '), 'A'), ((97, ' '), 'E'), ((98, ' '), 'D'), ((99, ' '), 'E'), ((100, ' '), 'A'), ((101, ' '), 'D'), ((102, ' '), 'Y'), ((103, ' '), 'Y'), ((104, ' '), 'C'), ((105, ' '), 'C'), ((106, ' '), 'S'), ((107, ' '), 'Y'), ((108, ' '), 'A'), ((109, ' '), 'G'), ((110, ' '), '-'), ((111, ' '), '-'), ((112, ' '), '-'), ((113, ' '), 'S'), ((114, ' '), 'S'), ((115, ' '), 'T'), ((116, ' '), 'V'), ((117, ' '), 'I'), ((118, ' '), 'F'), ((119, ' '), 'G'), ((120, ' '), 'G'), ((121, ' '), 'G'), ((122, ' '), 'T'), ((123, ' '), 'K'), ((124, 
' '), 'L'), ((125, ' '), 'T'), ((126, ' '), 'V'), ((127, ' '), 'L')], 'FR1': 'QSALTQPASVSGSPGQSITISCTGT', 'FR1_taille': 24, 'FR1_position': (1, 25), 'CDR1': 'SSDVGSYNL', 'CDR1_taille': 8, 'CDR1_position': (26, 34), 'FR2': 'VSWYQQHPGKAPKLMIY', 'FR2_taille': 16, 'FR2_position': (35, 51), 'CDR2': 'EVS', 'CDR2_taille': 2, 'CDR2_position': (52, 54), 'FR3': 'KRPSGVSNRFSGSKSGNTASLTISGLQAEDEADYYC', 'FR3_taille': 35, 'FR3_position': (55, 90), 'CDR3': 'CSYAGSSTVI', 'CDR3_taille': 9, 'CDR3_position': (91, 100), 'FR4': 'FGGGTKLTVL', 'FR4_taille': 9, 'FR4_position': (101, 110)}] [{'id': '1FL5_2|Chains B, D[auth H]|ANTIBODY GERMLINE PRECURSOR TO ANTIBODY 28B4|Homo sapiens (9606)_VH', 'whole_sequence': 'QVQLVESGGGLVQPGGSLRLSCATSGFTFTDYYMSWVRQPPGKALEWLGFIRNKANGYTTEYSASVKGRFTISRDNSQSILYLQMNTLRAEDSATYYCARDGSYAMDYWGQGTSVTVSS', 'whole_sequence_padded': 'QVQLVESGG-GLVQP-GGSLRLSCATSGFT-----FTDYYMSWVRQPPGKALEWLGFIRNKANGYTTEYSASVK-GRFTISRDNSQSILYLQMNTLRAEDSATYYCARDG-SYAMDYWGQGTSVTVSS', 'whole_sequence_delimited': 'QVQLVESGGGLVQPGGSLRLSCAT|SGFTFTD|YYMSWVRQPPGKALEWL|GFIRNKANGY|TTEYSASVKGRFTISRDNSQSILYLQMNTLRAEDSATY|YCARDGSYAMDY|WGQGTSVTVSS', 'chain_type': 'H', 'numbering': [((1, ' '), 'Q'), ((2, ' '), 'V'), ((3, ' '), 'Q'), ((4, ' '), 'L'), ((5, ' '), 'V'), ((6, ' '), 'E'), ((7, ' '), 'S'), ((8, ' '), 'G'), ((9, ' '), 'G'), ((10, ' '), '-'), ((11, ' '), 'G'), ((12, ' '), 'L'), ((13, ' '), 'V'), ((14, ' '), 'Q'), ((15, ' '), 'P'), ((16, ' '), '-'), ((17, ' '), 'G'), ((18, ' '), 'G'), ((19, ' '), 'S'), ((20, ' '), 'L'), ((21, ' '), 'R'), ((22, ' '), 'L'), ((23, ' '), 'S'), ((24, ' '), 'C'), ((25, ' '), 'A'), ((26, ' '), 'T'), ((27, ' '), 'S'), ((28, ' '), 'G'), ((29, ' '), 'F'), ((30, ' '), 'T'), ((31, ' '), '-'), ((32, ' '), '-'), ((33, ' '), '-'), ((34, ' '), '-'), ((35, ' '), '-'), ((36, ' '), 'F'), ((37, ' '), 'T'), ((38, ' '), 'D'), ((39, ' '), 'Y'), ((40, ' '), 'Y'), ((41, ' '), 'M'), ((42, ' '), 'S'), ((43, ' '), 'W'), ((44, ' '), 'V'), ((45, ' '), 'R'), ((46, ' '), 'Q'), ((47, ' '), 
'P'), ((48, ' '), 'P'), ((49, ' '), 'G'), ((50, ' '), 'K'), ((51, ' '), 'A'), ((52, ' '), 'L'), ((53, ' '), 'E'), ((54, ' '), 'W'), ((55, ' '), 'L'), ((56, ' '), 'G'), ((57, ' '), 'F'), ((58, ' '), 'I'), ((59, ' '), 'R'), ((60, ' '), 'N'), ((61, ' '), 'K'), ((62, ' '), 'A'), ((63, ' '), 'N'), ((64, ' '), 'G'), ((65, ' '), 'Y'), ((66, ' '), 'T'), ((67, ' '), 'T'), ((68, ' '), 'E'), ((69, ' '), 'Y'), ((70, ' '), 'S'), ((71, ' '), 'A'), ((72, ' '), 'S'), ((73, ' '), 'V'), ((74, ' '), 'K'), ((75, ' '), '-'), ((76, ' '), 'G'), ((77, ' '), 'R'), ((78, ' '), 'F'), ((79, ' '), 'T'), ((80, ' '), 'I'), ((81, ' '), 'S'), ((82, ' '), 'R'), ((83, ' '), 'D'), ((84, ' '), 'N'), ((85, ' '), 'S'), ((86, ' '), 'Q'), ((87, ' '), 'S'), ((88, ' '), 'I'), ((89, ' '), 'L'), ((90, ' '), 'Y'), ((91, ' '), 'L'), ((92, ' '), 'Q'), ((93, ' '), 'M'), ((94, ' '), 'N'), ((95, ' '), 'T'), ((96, ' '), 'L'), ((97, ' '), 'R'), ((98, ' '), 'A'), ((99, ' '), 'E'), ((100, ' '), 'D'), ((101, ' '), 'S'), ((102, ' '), 'A'), ((103, ' '), 'T'), ((104, ' '), 'Y'), ((105, ' '), 'Y'), ((106, ' '), 'C'), ((107, ' '), 'A'), ((108, ' '), 'R'), ((109, ' '), 'D'), ((110, ' '), 'G'), ((111, ' '), '-'), ((112, ' '), 'S'), ((113, ' '), 'Y'), ((114, ' '), 'A'), ((115, ' '), 'M'), ((116, ' '), 'D'), ((117, ' '), 'Y'), ((118, ' '), 'W'), ((119, ' '), 'G'), ((120, ' '), 'Q'), ((121, ' '), 'G'), ((122, ' '), 'T'), ((123, ' '), 'S'), ((124, ' '), 'V'), ((125, ' '), 'T'), ((126, ' '), 'V'), ((127, ' '), 'S'), ((128, ' '), 'S')], 'FR1': 'QVQLVESGGGLVQPGGSLRLSCAT', 'FR1_taille': 23, 'FR1_position': (1, 24), 'CDR1': 'SGFTFTD', 'CDR1_taille': 6, 'CDR1_position': (25, 31), 'FR2': 'YYMSWVRQPPGKALEWL', 'FR2_taille': 16, 'FR2_position': (32, 48), 'CDR2': 'GFIRNKANGY', 'CDR2_taille': 9, 'CDR2_position': (49, 58), 'FR3': 'TTEYSASVKGRFTISRDNSQSILYLQMNTLRAEDSATY', 'FR3_taille': 37, 'FR3_position': (59, 96), 'CDR3': 'YCARDGSYAMDY', 'CDR3_taille': 11, 'CDR3_position': (97, 108), 'FR4': 'WGQGTSVTVSS', 'FR4_taille': 10, 'FR4_position': 
(109, 119)}]

I also tried running the same script in a different environment. Both environments have ANARCI installed with the command conda install bioconda::anarci, and I get different results.

Thank you very much for your help, Celia
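One way to sanity-check a numbering like the ones above inside a script is to index it by position and look at the IMGT anchor positions. The sketch below assumes the list-of-((position, insertion_code), residue) format shown in this thread and relies on the IMGT convention that conserved residues sit at fixed positions (Cys 23, Trp 41, Cys 104, Phe or Trp 118). The names IMGT_ANCHORS, residues_by_position and misplaced_anchors are illustrative helpers, not part of ANARCI, and the input is a small excerpt of the 1ae6_L numbering posted above.

```python
# Conserved IMGT anchor positions and the residues expected at each one.
IMGT_ANCHORS = {23: "C", 41: "W", 104: "C", 118: "FW"}


def residues_by_position(numbering):
    """Map (position, insertion_code) -> residue, skipping gap entries ('-')."""
    return {key: aa for key, aa in numbering if aa != "-"}


def misplaced_anchors(numbering):
    """Return {position: residue_found} for every anchor that does not match.

    A value of None means the position is absent or is a gap.
    """
    by_pos = residues_by_position(numbering)
    problems = {}
    for pos, allowed in IMGT_ANCHORS.items():
        found = by_pos.get((pos, " "))
        if found is None or found not in allowed:
            problems[pos] = found
    return problems


# Excerpt of the 1ae6_L numbering reported in this thread (anchor neighbourhoods only).
reported = [
    ((23, " "), "S"), ((24, " "), "C"),
    ((41, " "), "Y"), ((42, " "), "W"),
    ((104, " "), "Y"), ((105, " "), "C"),
    ((118, " "), "F"), ((119, " "), "G"),
]

print(misplaced_anchors(reported))
```

An empty dictionary would mean all four anchors are where IMGT expects them. For this excerpt the check flags positions 23, 41 and 104, which matches the observation that the cysteine appears at 105 instead of 104.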

ALGW71 commented 3 months ago

Thanks Celia, we will investigate this and get back to you. Thanks for raising this issue!

ALGW71 commented 2 months ago

Hi Celia,

With the sequences you have given me I do not see a difference between the command line and the python API.

I have tested using the GitHub version of ANARCI and the conda version, and although the conda version gives the wrong results, the outputs are the same on the command line and in the Python API.

I have tested with three sequences and there is no difference on my end.

Are you sure that, when you run on the command line and in Python, the conda environment and the ANARCI version being called are the same?

You are correct that the Bioconda version of ANARCI gives worse results; that is why we recommend using the GitHub version.

Please let me know if you still see two different results depending on the command line or API usage, after checking with fresh conda environments etc.
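A quick way to check which installation each entry point resolves to is to print the interpreter, the location of the anarci module, and the ANARCI executable found on PATH. This is a minimal sketch using only the Python standard library; describe_anarci_install is a hypothetical helper name, and the module name anarci and executable name ANARCI are taken from the usage shown earlier in this thread.

```python
import importlib.util
import shutil
import sys


def describe_anarci_install(module_name="anarci", executable="ANARCI"):
    """Report where the Python module and the command-line tool come from.

    Values are None when the module or the executable cannot be found.
    """
    spec = importlib.util.find_spec(module_name)
    return {
        "python": sys.executable,                      # interpreter running this script
        "module_path": spec.origin if spec else None,  # file the import would load
        "cli_path": shutil.which(executable),          # executable a shell would run
    }


if __name__ == "__main__":
    for key, value in describe_anarci_install().items():
        print(f"{key}: {value}")
```

If module_path and cli_path point into different environments (for example, one inside a conda env and one elsewhere), the script and the command line are probably not running the same ANARCI installation.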

Many thanks, Alex

ALGW71 commented 2 months ago

Cannot replicate behaviour.

immunomotivees commented 2 months ago

Thank you very much for your response. I was using the conda version, and since I switched to the GitHub one I no longer have the problem.