Closed renyuan001 closed 3 months ago
Hi @renyuan001,
Based on the information you provided, it seems that the HLA alleles are missing a "*"; see this section in the tutorial (https://snaf.readthedocs.io/en/latest/tutorial.html#identify-mhc-bound-neoantigens-t-antigen). Let me know if that solves the problem. If not, we can dig further.
Best, Frank
I checked sample_hla.txt:
sample       hla
LYB.bed      HLA-A24:02,HLA-A11:01,HLA-B40:01,HLA-B15:11,HLA-C03:03,HLA-C15:02
FF.bed       HLA-A24:02,HLA-A02:07,HLA-B46:01,HLA-B40:01,HLA-C03:04,HLA-C01:02
ZCZ.bed      HLA-A26:01,HLA-A24:02,HLA-B35:01,HLA-B59:01,HLA-C03:03,HLA-C01:02
FYW.bed      HLA-A02:01,HLA-A30:01,HLA-B58:01,HLA-B51:01,HLA-C07:04,HLA-C03:02
LLM.bed      HLA-A33:03,HLA-A30:01,HLA-B44:03,HLA-B58:01,HLA-C14:03,HLA-C03:02
WXM.bed      HLA-A33:03,HLA-A02:03,HLA-B38:02,HLA-B13:01,HLA-C07:02,HLA-C03:04
tumor03.bed  HLA-A30:01,HLA-A03:01,HLA-B35:01,HLA-B13:02,HLA-C06:02,HLA-C04:01
When I submit the content here, the "*" characters are not displayed, but they are included in sample_hla.txt.
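Since the asterisks can get lost when text is pasted into a comment, it may help to check the strings programmatically instead of by eye. The following is a minimal sketch, not part of SNAF: the regular expression is an assumption based on the `HLA-A*24:02` style shown in the tutorial, and `find_malformed_alleles` is a hypothetical helper name.

```python
import re

# Assumed allele form, e.g. HLA-A*24:02: gene A/B/C, an asterisk, then two
# colon-separated numeric fields. This pattern is an illustration only and
# is not SNAF's or netMHCpan's own validation rule.
HLA_PATTERN = re.compile(r"^HLA-[ABC]\*\d{2,3}:\d{2,3}$")


def find_malformed_alleles(hla_string):
    """Return the alleles in a comma-separated string that do not match HLA_PATTERN."""
    alleles = [allele.strip() for allele in hla_string.split(",")]
    return [allele for allele in alleles if not HLA_PATTERN.match(allele)]


# Both alleles carry the asterisk, so nothing is reported.
print(find_malformed_alleles("HLA-A*24:02,HLA-B*40:01"))
# The first allele lacks the asterisk and is reported.
print(find_malformed_alleles("HLA-A24:02,HLA-B*40:01"))
```

Running this over each value of the `hla` column would list any allele whose asterisk is missing in the file itself, independent of how the text renders on this page.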
Hi @renyuan001
Are you trying to say the asterisk is in the sample file?
If so, is there any chance that netMHCpan was not set up properly? I don't know whether you can access the YouTube video I recorded on setting up netMHCpan (https://www.youtube.com/watch?v=KrAzbR5mRIQ). Basically, make sure you download the /data and modify your netMHCpan script accordingly.
Your netMHCpan 4.1 folder:
The netMHCpan script:
Let me know if that solves the problem.
Best, Frank
Hi @renyuan001,
It does seem that you did everything correctly. Would you be comfortable sharing the counts file and the sample_hla file (I guess it's the same as the one you showed here) with me, so that I can test it on my end?
You can email me at guangyuan.li@nyulangone.org if the data is meant to be private.
Best, Frank
jcmq = snaf.JunctionCountMatrixQuery(junction_count_matrix=df,cores=40,add_control=add_control,outdir='result')
reduce valid NeoJunction from 57300 to 9158 because they are present in GTEx
reduce valid Neojunction from 9158 to 7200 because they are present in added control tcga_control

sample_to_hla = pd.read_csv('sample_hla.txt',sep='\t',index_col=0)['hla'].to_dict()
hlas = [hla_string.split(',') for hla_string in df.columns.map(sample_to_hla)]

jcmq.run(hlas=hlas,outdir='./result')
junction_count_matrix: (57300, 66)
cores: 30
valid: 7200
invalid: 50100
cond_df: (57300, 66)
subset: (7200, 66)
translated: list of 7200 nj objects
cond_subset_df: (7200, 66)
results: list of length 2
[1] make sure netMHCpan path is set correctly
netMHCpan_path = '/home/ry-03/data/SNAF/netMHCpan-4.1/netMHCpan'
[2] make sure HLA allele format is correct
sample   hla
LYB.bed  HLA-A24:02,HLA-A11:01,HLA-B40:01,HLA-B15:11,HLA-C03:03,HLA-C15:02
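For the first check, a quick way to rule out a path problem is to test the location from Python before running the pipeline. This is only a sketch under assumptions: `check_netmhcpan_path` is a hypothetical helper, and it assumes the downloaded data folder sits in the same directory as the netMHCpan script, which may not match every installation.

```python
import os


def check_netmhcpan_path(netMHCpan_path):
    """Return a list of problems found with the given netMHCpan script path."""
    problems = []
    if not os.path.isfile(netMHCpan_path):
        problems.append("not a file: " + netMHCpan_path)
    elif not os.access(netMHCpan_path, os.X_OK):
        problems.append("not executable: " + netMHCpan_path)
    # Assumption: the data folder lives next to the script.
    data_dir = os.path.join(os.path.dirname(netMHCpan_path), "data")
    if not os.path.isdir(data_dir):
        problems.append("missing data folder: " + data_dir)
    return problems


# An empty list means none of the three checks failed.
print(check_netmhcpan_path("/home/ry-03/data/SNAF/netMHCpan-4.1/netMHCpan"))
```

An empty result only shows that the files are in place; it does not confirm that the home directory set inside the netMHCpan script itself points to the right folder.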
The column names of df are in the same order as the row names of sample_hla.txt.
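The lookup above goes by sample name rather than by position, so what matters is that every column name of df has an entry in the mapping. A name with no entry would come back as a missing value rather than a string. A minimal sketch for checking this, using a hypothetical helper `samples_without_hla` and made-up toy data:

```python
def samples_without_hla(sample_names, sample_to_hla):
    """Return the sample names that have no usable HLA string in the mapping."""
    missing = []
    for name in sample_names:
        value = sample_to_hla.get(name)
        # Treat absent keys, non-string values (e.g. NaN) and blank strings as missing.
        if not isinstance(value, str) or not value.strip():
            missing.append(name)
    return missing


# Hypothetical toy data: ZCZ.bed has no entry in the mapping.
columns = ["LYB.bed", "FF.bed", "ZCZ.bed"]
mapping = {
    "LYB.bed": "HLA-A*24:02,HLA-A*11:01",
    "FF.bed": "HLA-A*24:02,HLA-A*02:07",
}
print(samples_without_hla(columns, mapping))  # ['ZCZ.bed']
```

With the real data, calling `samples_without_hla(df.columns, sample_to_hla)` should return an empty list if every sample is covered.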
The error still occurs, and these files are empty: burden_stage3.txt, frequency_stage3.txt, x_neoantigen_frequency_stage3.pdf, x_occurence_frequency_stage3.pdf.