griffithlab / pVACtools

http://www.pvactools.org
BSD 3-Clause Clear License

pVACseq TypeError: '<' not supported between instances of 'float' and 'str' #1024

Closed weilinwu97 closed 9 months ago

weilinwu97 commented 11 months ago

Installation Type

Docker

pVACtools Version / Docker Image

griffithlab/pvactools:4.0.1

Python Version

No response

Operating System

No response

Describe the bug

Encountered this error while running pVACseq to predict neoepitopes from VCFs. I am running a cohort analysis of roughly 500 samples, and this error came up for 10 of them.

2023-07-27T03:07:58.207477414Z Traceback (most recent call last):
2023-07-27T03:07:58.207501380Z   File "/usr/local/bin/pvacseq", line 8, in <module>
2023-07-27T03:07:58.207507200Z     sys.exit(main())
2023-07-27T03:07:58.207511750Z   File "/usr/local/lib/python3.7/site-packages/pvactools/tools/pvacseq/main.py", line 123, in main
2023-07-27T03:07:58.207516252Z     args[0].func.main(args[1])
2023-07-27T03:07:58.207520821Z   File "/usr/local/lib/python3.7/site-packages/pvactools/tools/pvacseq/run.py", line 138, in main
2023-07-27T03:07:58.207525335Z     pipeline.execute()
2023-07-27T03:07:58.207529299Z   File "/usr/local/lib/python3.7/site-packages/pvactools/lib/pipeline.py", line 452, in execute
2023-07-27T03:07:58.207533351Z     split_parsed_output_files = self.parse_outputs(chunks)
2023-07-27T03:07:58.207537713Z   File "/usr/local/lib/python3.7/site-packages/pvactools/lib/pipeline.py", line 413, in parse_outputs
2023-07-27T03:07:58.207542249Z     parser.execute()
2023-07-27T03:07:58.207546127Z   File "/usr/local/lib/python3.7/site-packages/pvactools/lib/output_parser.py", line 629, in execute
2023-07-27T03:07:58.207550459Z     iedb_results = self.process_input_iedb_file(tsv_entries)
2023-07-27T03:07:58.207554507Z   File "/usr/local/lib/python3.7/site-packages/pvactools/lib/output_parser.py", line 515, in process_input_iedb_file
2023-07-27T03:07:58.207558969Z     iedb_results_with_metrics = self.add_summary_metrics(iedb_results)
2023-07-27T03:07:58.207563435Z   File "/usr/local/lib/python3.7/site-packages/pvactools/lib/output_parser.py", line 452, in add_summary_metrics
2023-07-27T03:07:58.207568439Z     corresponding_wt = min(result['wt_{}s'.format(metric)][best_mt_value_method].values())
2023-07-27T03:07:58.207572243Z TypeError: '<' not supported between instances of 'float' and 'str'
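
For readers hitting the same traceback: the failing call is a plain `min()` over a dictionary's values, so the error can be reproduced in isolation whenever those values mix floats and strings. The snippet below is only a sketch of that mechanism. The dictionary contents, the `"NA"` placeholder, and the `safe_min` helper are hypothetical; the thread does not show what the real wild-type score dictionary contained, and this is not pVACtools code.

```python
# Hypothetical stand-in for result['wt_{}s'.format(metric)][best_mt_value_method]
# from the traceback. The real values are not shown in the thread.
wt_scores = {"HLA-A*02:07": "NA", "HLA-B*15:02": 512.3}


def safe_min(values):
    """Return the smallest value that parses as a float, or None if there is none."""
    numeric = []
    for value in values:
        try:
            numeric.append(float(value))
        except (TypeError, ValueError):
            # Skip placeholders such as "NA" instead of comparing them to floats.
            continue
    return min(numeric) if numeric else None


# A plain min() compares the float against the string and fails.
try:
    min(wt_scores.values())
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Filtering to numeric values first avoids the mixed-type comparison.
print(safe_min(wt_scores.values()))
```

Whether skipping non-numeric entries is the right behaviour depends on why a string ended up among the scores in the first place, so this illustrates the failure mode rather than proposing a fix.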

How to reproduce this bug

pvacseq run \
--iedb-install-directory /opt/iedb \
--pass-only \
--n-threads 16 \
--net-chop-method cterm \
<path.to.vcf.gz> \
<sample_id> \
'HLA-A*02:07','HLA-A*30:01','HLA-B*15:02','HLA-B*46:01','HLA-C*01:02','HLA-C*08:01' \
all

Input files

No response

Log output

see description

Output files

No response

susannasiebert commented 11 months ago

Hi @weilinwu97,

Thank you for your interest in pVACtools and I apologize that you're running into errors with our software.

Can you please try upgrading to the latest version (4.0.4) and see if the error persists there? If the error still occurs under that version, can you please attach an example input VCF to this ticket to allow us to replicate this error on our end?

Kind regards, Susanna

weilinwu97 commented 11 months ago

Hi Susanna,

The same error persists with the latest Docker version.

The input vcf is attached for you to replicate the issue. input.vcf.gz

Cheers, Weilin

susannasiebert commented 11 months ago

Unfortunately, the attached VCF is not VEP-annotated. Can you please share the VEP-annotated VCF?

weilinwu97 commented 11 months ago

Sorry, here is the VEP version. input.vep.vcf.gz

thanks, Weilin

susannasiebert commented 11 months ago

Hi @weilinwu97, unfortunately I'm unable to replicate this issue on my end. Would you be able to share your full output directory with me so I can check whether differences in the intermediate files could be causing this?

weilinwu97 commented 11 months ago

Hi Susanna,

There are a few more annotation steps after VEP in my pipeline. Here is the final VCF after all annotations: input.vep.annotated.vcf.gz

I ran this on CAVATICA, which uses AWS cloud instances. Unfortunately, I don't have access to the actual compute instance, so I can't share the output directory. I can only see the stderr, which is included in my first post.

Many thanks.

susannasiebert commented 11 months ago

I also was not able to replicate this error with the second input VCF. My run finished successfully. I suspect that there was an error with one of the intermediate prediction files created by pVACtools. I suggest you rerun the affected samples from scratch to see if the issue persists for you. I'm sorry I'm unable to provide more help unless I can either replicate the issue or have access to these intermediate files to further investigate.

weilinwu97 commented 11 months ago

I tried to run this locally using the Docker image. Unfortunately, I am on macOS with an M1 chip, and the TensorFlow installed in the Docker image did not work on it. That said, the run did complete end to end WITHOUT issue using the non-neural-network prediction methods (PickPocket, SMM, etc.). I re-tested this on the cloud and it worked fine as well, which makes me suspect that the error comes from the outputs of the neural-network prediction methods.
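
One way to test that suspicion, if the intermediate prediction files can be recovered, would be to scan their score columns for cells that do not parse as numbers. The sketch below assumes a tab-separated file with a header row. The column name `ic50`, the sample rows, and the `find_non_numeric` helper are all hypothetical, since the thread does not describe the layout of the intermediate files.

```python
import csv
import io


def find_non_numeric(handle, score_columns):
    """Yield (line_number, column, value) for score cells that do not parse as float."""
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        for column in score_columns:
            value = row.get(column)
            try:
                float(value)
            except (TypeError, ValueError):
                # TypeError covers a missing cell (None), ValueError a string like "NA".
                yield reader.line_num, column, value


# Hypothetical file contents; a real run would pass an open file handle instead.
sample = io.StringIO(
    "peptide\tic50\n"
    "KLMNPQRSV\t123.4\n"
    "AAAGGGTTT\tNA\n"
)
problems = list(find_non_numeric(sample, ["ic50"]))
print(problems)
```

Any rows reported this way would be candidates for the string value that `min()` fails on.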

I need to find a way to run it in a Linux environment to test the neural-network prediction methods. In the meantime, I am happy to run using just the non-neural-network prediction methods.

You can close the ticket now. I will re-open this when I manage to replicate the behaviour locally as well. Thank you.