greydanus / mnist1d

A 1D analogue of the MNIST dataset for measuring spatial biases and answering Science of Deep Learning questions.
Apache License 2.0

Remote code execution possibility with pickled python objects #6

Closed emiruz closed 4 months ago

emiruz commented 1 year ago

Hey, thank you for this, but loading pickled objects in Python can execute arbitrary code, which opens the door to remote code execution. It means I have to trust you in order to use this data. I'd be grateful if you'd consider distributing the data in some other format.
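
For readers unfamiliar with the concern, here is a minimal sketch of the mechanism. The `Payload` class is hypothetical and uses a harmless callable (`int`) as a stand-in; a malicious file could name any importable callable instead.

```python
import pickle


class Payload:
    # __reduce__ tells pickle how to rebuild the object: at load time,
    # pickle calls the returned callable with the returned arguments.
    def __reduce__(self):
        # Harmless stand-in; the author of a pickle file chooses this callable.
        return (int, ("42",))


blob = pickle.dumps(Payload())

# Loading runs int("42"); the caller gets its result, not a Payload instance.
restored = pickle.loads(blob)
print(type(restored).__name__, restored)  # prints "int 42"
```

The point is that `pickle.loads` executes a call chosen by whoever wrote the file, which is why the Python documentation warns against unpickling data from untrusted sources.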

greydanus commented 1 year ago

Emiruz, this is a great point and I'll make the change, although I can't promise it'll happen in the next week or so. In the meantime, you'll either have to trust me or you could try unpickling in a Colab, saving the data in some other format, and sending that file to me. I could then add it to the repo and make the necessary code changes so that future researchers don't face the same dilemma. Thanks for bringing this issue up!

emiruz commented 1 year ago

The Colab route is a good idea. I'll give it a go, repackage the data, and send you a pull request when I get a minute.



breuderink commented 1 year ago

I think that the data can be easily exported as a set of NumPy arrays:

np.savez_compressed('mnist1d_data.npz', **data)
np.savez_compressed('mnist1d_data_shuff.npz', **data_shuff)

Where could that be integrated? The Building MNIST1D notebook gives an error when I run it on Colab. It can't find ./static/human_q1.pkl.
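
A rough sketch of the round trip, assuming the unpickled `data` dict holds only plain numeric arrays. The dict below is a made-up stand-in with arbitrary shapes, not the real dataset, and the temporary path is only for illustration.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for the unpickled dataset dict; shapes are made up.
data = {
    "x": np.random.randn(8, 40).astype(np.float32),
    "y": np.arange(8) % 10,
}

# Write each dict entry as a named array inside one compressed .npz archive.
path = os.path.join(tempfile.mkdtemp(), "mnist1d_data.npz")
np.savez_compressed(path, **data)

# allow_pickle=False makes np.load refuse object arrays, so nothing in the
# file is ever passed through pickle when it is read back.
with np.load(path, allow_pickle=False) as archive:
    loaded = {key: archive[key] for key in archive.files}

print(sorted(loaded), loaded["x"].shape)
```

One caveat: if the dict contains entries that are not plain arrays (nested dicts, for instance), NumPy stores them as object arrays, which need pickle to load. Those entries would have to be flattened into separate arrays first for the export to actually avoid pickle.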

greydanus commented 4 months ago

The link to ./static/human_q1.pkl has been fixed (see the Building MNIST-1D notebook, notebook #2, for an example of how to load it properly). The dataset can still be generated from scratch, as documented in the README.