Open infWang opened 6 months ago
Hi @infWang,
Sorry for the trouble. Another user reported exactly the same issue, and in her case the netMHCpan path was wrong: netMHCpan never actually ran, so the output contained no neoantigens. The pandas error is raised because the resulting data frame is empty.
If my guess is right, you should now have a few output files, but the burden3 file is effectively empty (all zeros in the text file). In that case, could you check whether your netMHCpan path is set correctly? In particular, it should point to the netMHCpan executable itself rather than the folder that contains it. See below for an example of an incorrect path.
# incorrect
netMHCpan_path = '/user/ligk2e/netMHCpan-4.1'
# correct
netMHCpan_path = '/user/ligk2e/netMHCpan-4.1/netMHCpan'
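A quick way to catch this before a long run is to check the path yourself. The snippet below is only a minimal sketch, not part of SNAF: check_netmhcpan_path is a hypothetical helper name, and the path is the example value from above.

```python
import os

def check_netmhcpan_path(path):
    """Return a list of problems found with the configured path (empty if none)."""
    problems = []
    if os.path.isdir(path):
        # The "incorrect" case above: the install folder, not the executable.
        problems.append(f"{path!r} is a directory; point to the executable inside it")
    elif not os.path.isfile(path):
        problems.append(f"{path!r} does not exist")
    elif not os.access(path, os.X_OK):
        problems.append(f"{path!r} is not executable")
    return problems

# Example value taken from the comment above; replace with your own path.
netMHCpan_path = '/user/ligk2e/netMHCpan-4.1/netMHCpan'
for problem in check_netmhcpan_path(netMHCpan_path):
    print('netMHCpan path problem:', problem)
```

If nothing is printed, the path at least points to an executable file; it does not prove that netMHCpan itself runs correctly.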
If this is not the issue, would you mind sharing your code, stdout, stderr, and what your current folder looks like, so I can help you debug? If you would rather not share them here, you can also email me directly (guangyuan.li@nyulangone.org).
Just let me know, Frank
@frankligy Thank you for your prompt response and helpful guidance.
I have checked the netMHCpan path and realized that it was indeed set incorrectly. After correcting the path as you suggested, the issue has been resolved. I appreciate your assistance in identifying the root cause of the problem.
Best regards
Are there any other scenarios where this error may be encountered? I have gotten the same error when running both NetMHCpan and MHCflurry.
I am running on two melanoma patients from Hugo et al. 2016. SNAF reports 838 candidate neojunctions before the error:
WARNING: DEPRECATED USAGE: Forwarding SINGULARITYENV_TMPDIR as environment variable will not be supported in the future, use APPTAINERENV_TMPDIR instead
WARNING: DEPRECATED USAGE: Forwarding SINGULARITYENV_NXF_DEBUG as environment variable will not be supported in the future, use APPTAINERENV_NXF_DEBUG instead
/bin/bash: line 0: cd: /home/spvensko/dev-raft/projects/ots-splice-test/work/b7/d3c0914bde31b4c85a08b1bc440ecb: No such file or directory
Matplotlib created a temporary cache directory at /tmp/matplotlib-9qhtgorn because the default path (/home/spvensko/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
2024-01-29 16:58:36 starting initialization
Current loaded gtex cohort with shape (13908, 2629)
Adding cohort tcga_control with shape (13398, 705) to the database
now the shape of control db is (14027, 3334)
Adding cohort gtex_skin with shape (12891, 313) to the database
now the shape of control db is (14027, 3647)
2024-01-29 17:00:20 finishing initialization
reduce valid NeoJunction from 14046 to 1319 because they are present in GTEx
reduce valid Neojunction from 1319 to 877 because they are present in added control tcga_control
reduce valid Neojunction from 877 to 838 because they are present in added control gtex_skin
Hi @spvensko,
Based on my interactions with users so far, this error seems to occur when the whole neoantigen prediction step fails, leaving an empty dataframe at the end. Since you got the error right after the filtering step, it looks like the neoantigen prediction did not run at all.
As I mentioned, one cause is the path. If you have confirmed that is not the issue, another scenario I recently helped a user debug is the HLA format: they used a format like A*02:01 instead of HLA-A*02:01, which can also cause the prediction to fail.
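To rule out the HLA format quickly, you can normalize the allele names before passing them in. This is just a sketch under the assumption that the only problem is a missing HLA- prefix; add_hla_prefix is a hypothetical helper, not a SNAF function, and the alleles in the list are example values.

```python
def add_hla_prefix(allele):
    """Return the allele name with a leading 'HLA-' (added only if missing)."""
    allele = allele.strip()
    if allele.upper().startswith('HLA-'):
        # Already prefixed; just normalize the prefix to upper case.
        return 'HLA-' + allele[4:]
    return 'HLA-' + allele

# Example input mixing both formats.
hlas = ['A*02:01', 'HLA-B*07:02', ' C*07:01 ']
normalized = [add_hla_prefix(h) for h in hlas]
print(normalized)  # ['HLA-A*02:01', 'HLA-B*07:02', 'HLA-C*07:01']
```

It does not validate the allele itself, so a typo in the gene or the digits would still need to be checked by hand.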
If this is still not the case, would you mind sharing your code, stdout, stderr, and what the result folder looks like before the error occurs? Then I can look into it further.
Best, Frank
Thank you very much for your excellent work. I encountered some errors while running my own data following the tutorial. Here is the error log:
snaf.JunctionCountMatrixQuery.generate_results(path='./zhangjiang_data/sanf_res/after_prediction.p',outdir='./zhangjiang_data/sanf_res/')
I would greatly appreciate your guidance whenever it is convenient for you. Thank you for your kind assistance.