sprky0 / jstor

keepgrabbing2.py? #1

Open speedplane opened 9 years ago

speedplane commented 9 years ago

Any idea what happened to keepgrabbing2.py? It's mentioned in the indictment, paragraph 29:

https://www.docketalarm.com/cases/Massachusetts_District_Court/1--11-cr-10260/%20USA_v._Swartz/53/

sprky0 commented 9 years ago

Good question -- let me know if you're able to find a scan anywhere, would love to include it here as well. I poked around but haven't had any luck yet.

speedplane commented 8 years ago

Interestingly, all of the files in Aaron's case were set to be publicly released, following a complex redaction procedure between Aaron's Estate, JSTOR/MIT, and the US Attorney General's office.

So someone should have keepgrabbing2.py and may be allowed to share it publicly. It may make sense to reach out to Aaron's former attorney, Michael J. Pineault of the firm Clements & Pineault, LLP (email: mpineault at clementspineault.com).

davidrdz93 commented 7 years ago

What is "Cookie: TENACIOUS=" in line 14?

speedplane commented 7 years ago

I sent an email to Mr. Pineault inquiring about the documents that were set to be made available.

davidrdz93 commented 7 years ago

Great

davidrdz93 commented 7 years ago

Let us know!!

aashumallik commented 6 years ago

Does anyone know if the documents were made available?

davidrdz93 commented 6 years ago

Why would you run the curl --socks5 command with -H and a long random number? Does it confuse the server?

sprky0 commented 6 years ago

@davidrdz93 The -H argument adds a header to the HTTP request that cURL sends. In this case, the header spoofs a client-held cookie named TENACIOUS, which is populated with a fresh random number on each request.

As for the reason, it likely has something to do with JSTOR's API: for example, perhaps it consumes a cookie called TENACIOUS that ties requests to a user session for the purpose of rate limiting. The only way to know for sure would be to inspect some real requests and compare them with what this script generates.
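
To make the header mechanics concrete, here is a minimal sketch of what "a cookie header with a new random value per request" looks like in Python. It only builds request objects and prints the header; nothing is sent. The helper name `build_request`, the placeholder URL, and the 16-digit token format are my own assumptions for illustration, not taken from the script or from anything JSTOR documents.

```python
import random
import urllib.request


def build_request(url: str) -> urllib.request.Request:
    """Build (but do not send) a request carrying a randomized cookie.

    Hypothetical helper: the token format is an assumption made
    for illustration only.
    """
    # A new 16-digit number is drawn on every call, so each request
    # presents a different cookie value to the server.
    token = str(random.randrange(10**15, 10**16))
    # Equivalent in spirit to: curl -H "Cookie: TENACIOUS=<token>" <url>
    return urllib.request.Request(url, headers={"Cookie": f"TENACIOUS={token}"})


# Placeholder URL; no network traffic happens here.
req_a = build_request("http://example.org/")
req_b = build_request("http://example.org/")

print(req_a.get_header("Cookie"))
print(req_b.get_header("Cookie"))
```

If the server really does key sessions or rate limits on that cookie (which is a guess), each request built this way would look like it came from a different session.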