Closed lkuchenb closed 2 years ago
Cell 4: The IDs and predicted values of PCM differ from the ones in the legacy notebook. I find no IDs that match the ones featured in the legacy notebook. I guess the names or data changed, but only by a tiny amount (e.g. -1.93 vs -1.77).
Cell 5: Again, marginally different values and ID compared to the legacy tutorial.
Cell 8: CleavageSitePredictorFactory using proteasmm_c returns marginally different values (eg. -1.93 vs -1.77)
Cell 11: Changed svmtap to smmtap, since svmtap is no longer available.
Cell 13: Legacy demonstrated the function filter_result with [("svmtap",ge, -30)]. I updated svmtap to smmtap and changed the filter threshold to 1, since the values are mostly in the range of -1 to 1, but maybe another value makes even more sense.
Cell 14: Legacy again used svmtap for the predictions here, changed it to smmtap. Choose a sensible threshold for smmtap so that the number of peptides after TAP transport changes. Preliminarily changed -30 to 1 but I am not sure if we want the highest scores for smmtap. Also, I don’t have UniTope installed, so this part is commented out and should be run again if UniTope is wanted.
Cell 15: Legacy again used svmtap for the predictions here, changed it to smmtap. Choose a sensible threshold for smmtap so that the number of peptides after TAP transport changes. Preliminarily changed -30 to 1 but I am not sure if we want the highest scores for smmtap. I commented out the epitope prediction and filtering using SVMHC since this is also no longer available. I was not sure which alternative method makes sense.
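To help pick a threshold for Cells 13 to 15, here is a minimal sketch of how a (method, comparator, threshold) filter changes the number of peptides kept. It is plain Python with made-up scores and a hypothetical `apply_filter` helper. It is not the library's `filter_result` implementation, only an illustration of the comparison semantics.

```python
from operator import ge

# Hypothetical smmtap-like scores, invented for illustration only.
scores = {
    "SYFPEITHI": 0.42,
    "LLGATCMFV": -0.87,
    "FPEITHIIA": 1.35,
    "GATCMFVAS": 0.05,
}


def apply_filter(scores, comparator, threshold):
    """Keep peptides whose score satisfies comparator(score, threshold)."""
    return {pep: s for pep, s in scores.items() if comparator(s, threshold)}


# Count how many peptides survive for a few candidate thresholds.
survivors = {t: len(apply_filter(scores, ge, t)) for t in (-30, 0, 1)}
for t, n in survivors.items():
    print(f"ge {t}: {n} of {len(scores)} peptides kept")
```

With scores mostly between -1 and 1, the legacy threshold of -30 keeps everything, so it no longer demonstrates any filtering. A threshold inside the score range is needed for the peptide count to change.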
This tutorial is fine.
Cell 4: read_fasta with type protein now returns the reference number contained in the file eg. “NP_852610.1” instead of as shown in legacy “Protein_0”; this is probably defined in Core/Protein
Cell 6: There are a lot more methods listed. This might depend on which methods I had installed, so it might make sense to rerun this with only the necessary methods installed. The following methods are missing compared to legacy: unitope 1.0 (not installed by me) and svmhc (we removed this one).
Cell 7: results.head shows different examples but the values are the same.
Cell 12: Returns NaNs for all methods (bimas, sim, syfpeithi) for me.
Cell 13: Filter for a meaningful method and value. Updated filter to syfpeithi (was svmhc before, which was removed), but due to all values being NaN, returns empty table.
Cell 15: The value 1.0 is predicted for all while legacy returns the value 0 for (L, L, G, A, T, C, M, F, V) and (S, Y, F, P, E, I, T, H, I).
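Regarding the empty table in Cell 13: that is the expected consequence of the NaNs from Cell 12, because any ordered comparison with NaN is false. A minimal sketch in plain Python, with made-up peptides and an arbitrary threshold of 20 (both are assumptions for illustration, not values from the notebook):

```python
import math
from operator import ge

nan = float("nan")

# Hypothetical prediction results where every score is missing.
predictions = {"SYFPEITHI": nan, "LLGATCMFV": nan}

# NaN >= 20 evaluates to False, so no peptide passes the filter.
kept = {pep: s for pep, s in predictions.items() if ge(s, 20)}

# Checking for missing scores first makes the cause visible.
all_missing = all(math.isnan(s) for s in predictions.values())
print(f"kept={kept}, all scores missing={all_missing}")
```

pandas applies the same IEEE 754 rule to float columns, so the filter itself is probably fine and the NaNs in Cell 12 are the thing to track down.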
Cell 5: Presents a user warning. All other values match.
Cell 7: Presents “Biopython Warning: Partial codon, len(sequence) not a multiple of three. Explicitly trim the sequence or add trailing N before translation. This may become an error in future.”
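For reference, the warning in Cell 7 goes away if the sequence is trimmed to a whole number of codons before translation. A minimal sketch with a hypothetical helper `trim_to_codon_boundary` and a made-up sequence. Whether trimming is the right fix for the tutorial's transcripts is a separate question.

```python
def trim_to_codon_boundary(seq):
    """Drop trailing nucleotides so that len(seq) is a multiple of three."""
    usable = len(seq) - len(seq) % 3
    return seq[:usable]


# Made-up 11-nucleotide sequence; the last two bases form a partial codon.
raw = "ATGGCCATTGT"
trimmed = trim_to_codon_boundary(raw)
print(trimmed, len(trimmed))

# With Biopython installed, Seq(trimmed).translate() would then see
# complete codons only (not executed here to keep the sketch stdlib-only).
```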
Cell 2: Cannot run cell successfully because OptiType is not installed.
Nothing to run in this tutorial.
I believe I did not update anything except for the name here.
Cell 4: Warning that it cannot find transcript NM_001293557 and that the reference number did not match ref to assigned variant
Cell 5: Warning that it cannot find transcript NM_001293557.
Cell 3: BIMAS is not working for me: “No predictions could be made with bimas for given input. Check your epitope length and HLA allele combination.”
Could not run through the remaining cells because Optitope is not installed.
I checked the issues Antonia addressed by comparing the upcoming PR #42 against the legacy branch. I'll go over them step by step:
1. CleavageAndTAPPrediction
Cell 4: The IDs and predicted values of PCM differ from the ones in the legacy notebook. I find no IDs that match the ones featured in the legacy notebook. I guess the names or data changed, but only by a tiny amount (e.g. -1.93 vs -1.77).
The discrepancy probably appeared after the update of the proteins.fasta input file. The current output makes more sense, because the right ID and sequence are imported.
Cell 5: Again, marginally different values and ID compared to the legacy tutorial.
See comment above.
Cell 8: CleavageSitePredictorFactory using proteasmm_c returns marginally different values (eg. -1.93 vs -1.77)
I don't observe that.
Cell 13: Legacy demonstrated the function filter_result with [("svmtap",ge, -30)]. I updated svmtap to smmtap and changed the filter threshold to 1, since the values are mostly in the range of -1 to 1, but maybe another value makes even more sense.
That is to be discussed. I arbitrarily inserted a filter threshold of 1 as Antonia suggested, but since it's a tutorial it is fine, I guess.
Cell 14: Legacy again used svmtap for the predictions here, changed it to smmtap. Choose a sensible threshold for smmtap so that the number of peptides after TAP transport changes. Preliminarily changed -30 to 1 but I am not sure if we want the highest scores for smmtap. Also, I don’t have UniTope installed, so this part is commented out and should be run again if UniTope is wanted.
See comment above. UniTope is not supported anymore. Replaced by smm.
Cell 14: see comment above
Cell 15: Adjusted to new EpitopePredictionResult structure
3. Epitope Prediction
Cell 4: read_fasta with type protein now returns the reference number contained in the file eg. “NP_852610.1” instead of as shown in legacy “Protein_0”; this is probably defined in Core/Protein
Yes it is. Since the proteins.fasta file changed (as stated above), that changed as well.
Cell 6: There are a lot more methods listed. This might depend on which methods I had installed, so it might make sense to rerun this with only the necessary methods installed. The following methods are missing compared to legacy: unitope 1.0 (not installed by me) and svmhc (we removed this one).
New tools have been added, e.g. the netmhc family, so that is fine.
Cell 12: Returns NaNs for all methods (bimas, sim, syfpeithi) for me.
Not for me. The values are identical to the legacy branch.
Cell 13: Filter for a meaningful method and value. Updated filter to syfpeithi (was svmhc before, which was removed), but due to all values being NaN, returns empty table.
See comment above.
Cell 15: The value 1.0 is predicted for all while legacy returns the value 0 for (L, L, G, A, T, C, M, F, V) and (S, Y, F, P, E, I, T, H, I)
It's random, right?
5. HLA Typing
Cannot execute this, because the pip way of installing Optitype is not working anymore (since pip doesn't support python2 anymore). Maybe I'm wrong here.
9. Vaccine Design
Cell 3: BIMAS is not working for me: “No predictions could be made with bimas for given input. Check your epitope length and HLA allele combination.”
Worked for me.
The tutorials have been updated accordingly in the PR #42
Thanks @antschum and @jonasscheid! I will add comments to PR #42.
@jonasscheid What is missing here?
The notebooks are still Python 2 and won't run with the ported lib. The docs probably also have some Python 2 specific examples.