uio-bmi / immuneML

immuneML is a platform for machine learning analysis of adaptive immune receptor repertoire data.
https://immuneml.uio.no
GNU Affero General Public License v3.0

WARNING: ABCMeta: chain was not set for sequence 0, skipping the sequence for matching... #34

KanduriC closed this issue 3 years ago

KanduriC commented 3 years ago

I am trying to use the MatchedRegex encoding with the Matches report, e.g. with the following yaml specification file, immuneML-imported data to reproduce, and motif file for the MatchedRegex encoding. The log file shows WARNING: ABCMeta: chain was not set for sequence 0, skipping the sequence for matching... for all sequences, I guess. I attach the output files of the Matches report, which show zero counts across all repertoires. Could you help me figure out the issue? Attachments: complete_match_count_table_csv.txt, repertoire_sizes_csv.txt

LonnekeScheffer commented 3 years ago

Thanks for reporting! I'll take a look at this. Are you able to upload the data to Google Drive as a zip? For some reason I am not able to download multiple files/folders (maybe I am missing some rights since it is on your Google Drive), and this time I am also unable to zip them. Last time I downloaded the files individually, but I tried with a subset of files now and got errors related to pickle, so I think the dataset has to be complete.

KanduriC commented 3 years ago

@BlueBasilisk Here is a zip file of the same data.

LonnekeScheffer commented 3 years ago

Hmm something strange is happening, I still get the same Pickle error when trying to import, meaning that I am not even getting to the point of running the code to encounter the results you have.

my error:

Exception: unsupported pickle protocol: 5

An error occurred while parsing the dataset implanted_ostmeyer. See the log above for more details.

ImmuneMLParser: an error occurred during parsing in function _parse_dataset  with parameters: ('implanted_ostmeyer', {'format': 'Pickle', 'params': {'path': '/Users/lonneke/Desktop/chakri_test/simulated_data/repertoire_implanting_rate__0.0005/sim_instruction/exported_dataset/pickle/regularization_rate.iml_dataset', 'result_path': '../../../../Desktop/chakri_test/out/datasets/implanted_ostmeyer/'}}, SymbolTable(), '../../../../Desktop/chakri_test/out/').

For more details on how to write the specification, see the documentation. For technical description of the error, see the log above.

Are you able to import this dataset now with the newest version of immuneML? @pavlovicmilena did something change with the 'pickle protocol' in immuneML recently?

pavlovicmilena commented 3 years ago

not in immuneML, but it could happen with different python versions i think, will check this.

LonnekeScheffer commented 3 years ago

Actually I might have already fixed your original error in the pull request I made earlier today (not merged in yet): https://github.com/uio-bmi/immuneML/pull/33

I see your dataset is based on OLGA, which does not have an explicit 'chains' column. But in the fix I made today the chain is inferred from the V or J gene columns, so then it should be present.
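To illustrate the idea of inferring the chain from the gene columns, here is a minimal sketch. It is not the code from the pull request: the helper name `infer_chain` and the regular expression are assumptions made for this example, and it only assumes that gene names start with a locus prefix such as TRB or IGH.

```python
import re
from typing import Optional

# Hypothetical helper for illustration only; not immuneML's actual implementation.
# Assumes gene names begin with a locus prefix (TRA, TRB, TRG, TRD, IGH, IGK, IGL)
# followed by the segment letter V, D or J, e.g. "TRBV12-3" or "IGHJ4".
_LOCUS_PATTERN = re.compile(r"^(TR[ABGD]|IG[HKL])[VDJ]", re.IGNORECASE)


def infer_chain(v_gene: Optional[str], j_gene: Optional[str]) -> Optional[str]:
    """Return the locus prefix found in the V gene, falling back to the J gene."""
    for gene in (v_gene, j_gene):
        if gene:
            match = _LOCUS_PATTERN.match(gene.strip())
            if match:
                return match.group(1).upper()
    # Neither column carried a recognisable locus prefix.
    return None
```

With a helper like this, a sequence that has no explicit chain value but does have a V or J gene name would still get a chain assigned, so it would no longer be skipped during matching.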

pavlovicmilena commented 3 years ago

ok, i merged your pull request. if it doesn't fix it, i can look into this pickle stuff :)

KanduriC commented 3 years ago

Pulling the latest changes has fixed the issue I reported. Thanks a lot :) Regarding the pickle import error: before updating the code I was probably already using one of the latest versions of immuneML, as I pulled a couple of newer versions last week. The pickle import works fine for me on both immunohub and my local machine; that is the only information I can provide regarding that issue.

pavlovicmilena commented 3 years ago

Great! Regarding the pickle error, I found this, which may be applicable: https://stackoverflow.com/questions/63329657/python-3-7-error-unsupported-pickle-protocol-5 We can solve that problem once we can replicate it :)
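For context on the Stack Overflow question linked above: pickle protocol 5 was added in Python 3.8, so a file written with protocol 5 cannot be loaded by Python 3.7 or earlier. Whether that is what happened here is not confirmed in this thread. The sketch below uses a made-up record and a hypothetical helper, `pickle_protocol`, to show how to check which protocol a pickle uses and how to write one that older interpreters can read.

```python
import pickle
import sys

# Made-up example record; any picklable object works the same way.
data = {"sequence": "CASSLGTDTQYF", "count": 3}

# Python 3.8+ reports 5 here; Python 3.4 to 3.7 report 4.
print(sys.version_info[:2], "highest protocol:", pickle.HIGHEST_PROTOCOL)

# Protocol 4 has been available since Python 3.4, so pinning it keeps
# the output loadable on interpreters that do not know protocol 5.
portable = pickle.dumps(data, protocol=4)


def pickle_protocol(blob: bytes) -> int:
    """Return the protocol number of a pickle written with protocol 2 or higher.

    Such pickles start with the PROTO opcode (0x80) followed by one byte
    holding the protocol number.
    """
    if blob[:1] != b"\x80":
        raise ValueError("no PROTO opcode: pickle uses protocol 0 or 1")
    return blob[1]


print("written with protocol:", pickle_protocol(portable))
```

Reading the first two bytes of a dataset file in this way would show whether it was written with protocol 5, without having to unpickle it.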