Closed michaelgallifrey closed 4 months ago
Could you update to the latest version of the Kaggle API and let us know if this is still a problem? If it is, could you provide more detail on the edits you made before pushing? I'm unable to repro the problem.
Thanks for looking into it and attempting a repro.
Unfortunately, I am still experiencing the issue:
abc@2a12a90d60dd:/mnt/hdd/src/kaggle$ kaggle -v
Kaggle API 1.6.14
abc@2a12a90d60dd:/mnt/hdd/src/kaggle$ kaggle kernels pull michaelgallifrey/exercise-model-validation -p try-again -m
Source code and metadata downloaded to try-again
abc@2a12a90d60dd:/mnt/hdd/src/kaggle$ kaggle kernels push -p try-again
500 - An internal server error occurred. Please ensure that your API client is up to date. If it is, please report a bug at github.com/Kaggle/kaggle-api - InternalServerError
abc@2a12a90d60dd:/mnt/hdd/src/kaggle$
As for the edit, I simply changed "You've built a model. In this exercise you will test how good your model is." to "You've built a model. In this exercise you will test how good your model is. Will it upload?" in the "Recap" section of https://www.kaggle.com/kernels/fork/1259097. It appears to happen with any edit I make though.
I wasn't able to reproduce your precise problem, but I think you were hitting a bug in the server that has been fixed.
I was able to get an error, which may be what you should have gotten:
Kernel push error: Notebook not found
The problem was that I had not versioned the notebook. After I created a version (with a full run, not a quick save), I could push it with no problem.
Let me know if that helps.
Thanks for all your hard work on this. Still a no go :(
I saved a version (using "Save & Run All (Commit)"), then did the following:
abc@2a12a90d60dd:/mnt/hdd/src/kaggle$ kaggle kernels pull michaelgallifrey/notebook86b2ec8431 -p test3 -m
Source code and metadata downloaded to test3
Changed "In this exercise you will test how good your model is." to "In this exercise you will test how good your model is and if you can push" and then:
abc@2a12a90d60dd:/mnt/hdd/src/kaggle$ kaggle kernels push -p test3
500 - An internal server error occurred. Please ensure that your API client is up to date. If it is, please report a bug at github.com/Kaggle/kaggle-api - InternalServerError
abc@2a12a90d60dd:/mnt/hdd/src/kaggle$ kaggle -v
Kaggle API 1.6.14
Anything else you want me to try?
Thanks for the quick response. I'll have to dig into this further.
@mgallifrey Sorry for the delay. It occurs to me that there could be an issue with line endings. I wasn't able to reproduce the problem, but I'm pretty sure my editor didn't change any line endings, either. Can you check the before/after content of your *.ipynb file to see if the line endings changed? If that isn't the problem, could you make your notebook public so I can test using it?
Good idea! No difference in line endings (that I can tell), but it looks like, at minimum (diffing one-line files is hard), the order of some of the JSON got switched around by the editor. Is there any reason to believe there's something private in the ipynb file, or would it be safe for me to attach the before and after here for you to have a look?
For what it's worth, I don't think my editor is particularly strange: I'm editing via the VSCode Jupyter plugin (albeit via an open source VSCode fork)
I can't rule out the possibility that the reordering is causing the problem. If you pull/push without editing, does it work?
If not, you can attach the before and after versions here.
Ok, I can confirm that I can push and pull when no changes are made.
Your hunch about line endings was a good one: I went ahead and put the before and after files into https://www.jsondiff.com/, and it looks like VSCode is changing the value of "source" in each cell from a string to an array of strings (with each line as an element, each still terminated with '\n'). Ordering aside, that appears to be the only semantic difference.
I'm assuming the VSCode output is still valid ipynb (although I haven't checked the spec); if so, can it be supported by the API?
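For anyone who wants to check the same thing without a web diff tool, here is a minimal Python sketch. It assumes the usual ipynb JSON layout with a top-level "cells" list; the helper names and the two inline notebooks are made up for illustration and are not part of the Kaggle client.

```python
def source_kinds(notebook):
    """Return 'str' or 'list' for the "source" field of each cell."""
    return [type(cell.get("source", "")).__name__
            for cell in notebook.get("cells", [])]


def source_texts(notebook):
    """Return each cell's source as one string, joining list-form sources."""
    texts = []
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        texts.append("".join(source) if isinstance(source, list) else source)
    return texts


# Hypothetical stand-ins for the notebook before and after editing.
before = {"cells": [{"cell_type": "markdown",
                     "source": "# Recap\nYou've built a model."}]}
after = {"cells": [{"cell_type": "markdown",
                    "source": ["# Recap\n", "You've built a model."]}]}

# The representation differs between the two notebooks...
print(source_kinds(before), source_kinds(after))
# ...but the text is the same once list-form sources are joined.
print(source_texts(before) == source_texts(after))
```

With real files, the two dictionaries would come from `json.load` on the pulled and edited .ipynb files instead of the inline examples.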
Thanks for checking that.
I think our version of JL is kinda old. I suspect VSCode is targeting a newer JL spec than what we're using. I don't know if there are any plans to update JL (not saying no, just I don't know). VSCode has a huge number of settings. Is there a way to tell it you're working with v2, not v4, JL files? If not, your best bet is to use a dumber editor.
Pardon my ignorance: what does JL stand for?
In any case, no setting that I could find, unfortunately :(
FWIW, the ipynb file that gets pulled down by kernels pull says it's nbformat version 4.4:
"nbformat": 4,
"nbformat_minor": 4
and the reference schema for 4.4 says source is supposed to be "represented as an array of lines". The docs say either an array or a string is fine. Totally get that this likely isn't a priority; figured I'd share either way though!
(edited to fix a typo)
Sorry, JL is JupyterLab, the notebook editor. It's an open source project. The latest version is 4.2.3.
Thanks for the links. Drilling down to the schema definition for source, I see that either a string or array of strings is accepted, so I wonder if there is some other problem that is preventing push from working.
Can you attach your edited notebook? I'll try to repro the problem, then look at the server logs to see what's breaking (if I'm lucky enough to find anything :)
Nice catch! That's what I get for just reading the description and not diving into the definition.
And sure! I've attached both the original (or at least something pulled down via kernels pull; it has some non-VSCode edits that were successfully pushed) and the same file after being edited in VSCode. They're bundled as a zip file because GitHub won't let me upload IPYNB files.
If I get a chance, I might write a little script that turns the source arrays into strings and try pushing the resulting notebook to confirm that's the issue.
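A rough sketch of what such a script could look like, using only the Python standard library. The function names and the file paths in the usage note are hypothetical, and it only touches the "source" field of each cell, on the assumption that this is the only field the push rejects.

```python
import json
from pathlib import Path


def normalize_sources(notebook):
    """Join list-form cell sources into single strings (modifies in place)."""
    for cell in notebook.get("cells", []):
        source = cell.get("source")
        if isinstance(source, list):
            # Each element already carries its own trailing '\n',
            # so a plain join reproduces the original text.
            cell["source"] = "".join(source)
    return notebook


def normalize_file(src, dst):
    """Read the notebook at src, normalize its sources, write it to dst."""
    notebook = json.loads(Path(src).read_text(encoding="utf-8"))
    normalize_sources(notebook)
    Path(dst).write_text(json.dumps(notebook, indent=1), encoding="utf-8")
```

Usage would be something like `normalize_file("edited.ipynb", "fixed.ipynb")` followed by pushing the folder that contains the rewritten notebook.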
Thanks. I haven't had a chance to look at your files yet, but I did read some source code and think the source being an array of strings is the problem. We need to use the Google Cloud protobuffer file format to upload everything. Our protobuf definition for push only allows strings as the source code. In theory, we should be able to make the Python client detect an array of strings and convert it to a single string (essentially embedding the script you described into the kaggle client), but I have to admit this isn't a very high priority right now.
If you had some free time and wanted to make a contribution, the code to modify is (I think) at this point in the source. But I understand that setting up a dev environment for kaggle-api is a bit time-consuming (having done it a couple times).
Sounds good! I'll take a stab at it when I get a chance.
You're not wrong about the dev environment being somewhat time-intensive :)
I came across some dependency issues; I filed a bug and submitted a PR for that too.
Just tested this in kaggle-1.6.17 and can confirm it works now!
Excellent! Thanks for the update.
When attempting to use the Kaggle client to pull, edit and push notebooks associated with the Introduction To Machine Learning course, kaggle kernels push results in an internal server error. Steps to reproduce:
In case it helps, here is the kernel-metadata.json (auto-created by the -m flag on the pull):