uwescience / sqlshare

Documentation and help for the SQLShare project
escience.washington.edu/sqlshare

Uploading large files #34

Open emmats opened 10 years ago

emmats commented 10 years ago

I'd like to upload a 6.22 GB file to SQLshare and the simple upload doesn't seem to work. I know there's a work-around using iPython - are there instructions for how to do this somewhere? Thanks!

sr320 commented 10 years ago

I believe you are thinking about the python client. Instructions are at https://github.com/uwescience/sqlshare-pythonclient

emmats commented 10 years ago

Will this work if one has never used/configured python before? I have python launcher 2.7. When I tried installing python libraries I got the following error: can't open file 'setup.py': [Errno 2] No such file or directory

sr320 commented 10 years ago

Yes. Did you cd sqlshare-pythonclient before you python setup.py install? This is a common mistake, and would result in your error. see https://github.com/uwescience/sqlshare-pythonclient#option-1-download-the-raw-source-code

emmats commented 10 years ago

When I try to cd sqlshare-pythonclient it says there is no such file or directory.

kubu4 commented 10 years ago

Can you provide a link to your IPython notebook so we can see exactly what you've done/tried?

emmats commented 10 years ago

Is that a dig at my lack of iPython notebook? Here's what I typed in terminal:

Last login: Wed Apr 9 18:41:26 on ttys000
Emma-Timmins-Schiffmans-MacBook-Pro:~ emmatimminsschiffman$ git clone git://github.com/uwescience/sqlshare-pythonclient.git

Agreeing to the Xcode/iOS license requires admin privileges, please re-run as root via sudo.

Emma-Timmins-Schiffmans-MacBook-Pro:~ emmatimminsschiffman$ cd sqlshare-pythonclient
-bash: cd: sqlshare-pythonclient: No such file or directory
Emma-Timmins-Schiffmans-MacBook-Pro:~ emmatimminsschiffman$

kubu4 commented 10 years ago

See this part? "Agreeing to the Xcode/iOS license requires admin privileges, please re-run as root via sudo."

You need to follow the instruction. So, you need to re-run the previous command with "sudo" typed in front of it:

$ sudo git clone git://github.com/uwescience/sqlshare-pythonclient.git

emmats commented 10 years ago

That went very differently. I should reconsider my policy of ignoring things I don't understand in the hopes they don't matter. Everything went well through step 2 and now I'm trying to do the API configuration. After running sudo python setup.py install I typed:

mkdir -p ~/.sqlshare
vim ~/.sqlshare/config [sqlshare] user=emmats@uw.edu password=[my password]

And it spits out this: 4 files to edit

~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
"~/.sqlshare/config" [New File]

kubu4 commented 10 years ago

You're using the editor vim (that's the "vim" in $ vim ~/.sqlshare/config [sqlshare] user=emmats@uw.edu password=[my password])?

Personally, I'd just edit the config file in this fashion:

$ open -a TextEdit ~/.sqlshare/config

That should open your file in the TextEdit application on your Mac. Then, paste the following into that file (replace "your-sql-share-account-name" with your actual username and replace "your-sql-share-API-key" with the key that you generated):

[sqlshare]
user=your-sql-share-account-name
password=your-sql-share-API-key

Save the file and quit TextEdit.

Hopefully that'll take care of things for you.

kubu4 commented 10 years ago

Sorry, Emma, the Mac may not create a new file, if one does not already exist, using the command I posted above. Since I can't test out what I typed above, the following will definitely make the config file:

$ touch ~/.sqlshare/config

THEN, you can use TextEdit to add the stuff with your username/API-key:

$ open -a TextEdit ~/.sqlshare/config

emmats commented 10 years ago

So I went through the following sequence of commands in terminal and everything went fine...

cd sqlshare-pythonclient
sudo python setup.py install
mkdir -p ~/.sqlshare

until:

d-69-91-154-176:sqlshare-pythonclient emmatimminsschiffman$ $touch ~/.sqlshare/config
-bash: /Users/emmatimminsschiffman/.sqlshare/config: No such file or directory

NB: I have to leave in 20 minutes and may not have time to look at this again tonight but will definitely work on it tomorrow :) Thanks!

kubu4 commented 10 years ago

I think you may have skipped the step where you make the ".sqlshare" directory:

$ mkdir -p $HOME/.sqlshare

Then, you can use the "touch" command to make the config file.
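
For reference, the leading "$" in the commands above stands for the shell prompt and is not typed. If it is typed literally, bash expands "$touch" to nothing and then tries to run the path itself as a command, which gives a "-bash: ...: No such file or directory" message of the kind quoted earlier. The directory and file steps can also be sketched in Python. This is a minimal sketch and not part of sqlshare-pythonclient: `write_sqlshare_config` and its `home` parameter are hypothetical names, it assumes the client reads `user` and `password` from a `[sqlshare]` section as described in this thread, and it is written for Python 3 rather than the Python 2.7 used above.

```python
import configparser
import os
import tempfile
from pathlib import Path


def write_sqlshare_config(user, api_key, home=None):
    """Create <home>/.sqlshare/config containing a [sqlshare] section.

    Hypothetical helper for illustration; `home` defaults to the real
    home directory and exists so the sketch can be tried in a temp dir.
    """
    home = Path(home) if home is not None else Path.home()
    config_dir = home / ".sqlshare"
    # Same effect as `mkdir -p ~/.sqlshare`: no error if it already exists.
    config_dir.mkdir(parents=True, exist_ok=True)

    # interpolation=None so a '%' in the key is written verbatim.
    parser = configparser.ConfigParser(interpolation=None)
    parser["sqlshare"] = {"user": user, "password": api_key}

    config_path = config_dir / "config"
    with open(config_path, "w") as handle:
        # Write "user=..." rather than "user = ..." to match the thread.
        parser.write(handle, space_around_delimiters=False)
    # The file holds a credential, so restrict it to the owner.
    os.chmod(config_path, 0o600)
    return config_path


if __name__ == "__main__":
    # Try it in a throwaway directory instead of the real home directory.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_sqlshare_config("your-sql-share-account-name",
                                     "your-sql-share-API-key", home=tmp)
        print(path.read_text())
```

Calling the function without `home` would write to the real `~/.sqlshare/config`, overwriting any existing file.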

emmats commented 10 years ago

That all worked. So then I tried to upload my files:

python multiupload.py /Users/emmatimminsschiffman/Desktop/swissprot_protein_names_03212014.txt /Users/emmatimminsschiffman/Desktop/swissprot_GO_association_03212014.txt

And it said: python: can't open file 'multiupload.py': [Errno 2] No such file or directory

Did I skip a step again? I got no errors up until that point. You can find the list of code I ran at the bottom of this evernote page: https://www.evernote.com/shard/s242/sh/67ba13af-2406-429f-b189-68758e58a074/aa9ddff097674ad9bb3fc8504a2fb10f

kubu4 commented 10 years ago

Change directories to your sqlshare installation "tools" location. I'm guessing you can enter the following to get there:

$ cd sqlshare-pythonclient/tools

Then, run your command.

But this brings up an issue I haven't been able to figure out either. I thought this process added the SQLshare API to the system PATH. However, I can't get this to work unless I'm in the actual directory where the SQLshare modules exist. And I've checked that Python has the modules' location in its PATH, so I don't know why we can't just execute a command like the one you issued above from anywhere in the system...

Does anyone have any input on this?

dhalperi commented 10 years ago

the tools are not installed to the system path


Daniel Halperin Director of Research for Scalable Analytics eScience Institute University of Washington


kubu4 commented 10 years ago

Hey Dan! For us newbs, what does this part of the installation accomplish?

echo 'export PYTHONPATH=$PYTHONPATH':`pwd` >> ~/.bash_profile
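
As a rough illustration of what that line affects: it appends the current directory to PYTHONPATH in ~/.bash_profile, and Python adds PYTHONPATH entries to its module search path (sys.path). That helps "import" statements find modules. It does not help "python multiupload.py" find the script file, because a script named on the command line is looked up relative to the current directory. The sketch below checks both behaviours with a child interpreter; `child_sys_path` and `run_script` are hypothetical helper names, and it assumes Python 3.7 or later.

```python
import json
import os
import subprocess
import sys


def child_sys_path(extra_dir):
    """Return sys.path as seen by a child Python started with PYTHONPATH=extra_dir."""
    env = dict(os.environ, PYTHONPATH=extra_dir)
    result = subprocess.run(
        [sys.executable, "-c", "import json, sys; print(json.dumps(sys.path))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def run_script(script_name, cwd, extra_dir):
    """Run `python script_name` from directory `cwd`; return the exit code.

    PYTHONPATH is set to extra_dir to show that it does not change where
    the interpreter looks for the script file itself.
    """
    env = dict(os.environ, PYTHONPATH=extra_dir)
    result = subprocess.run(
        [sys.executable, script_name],
        cwd=cwd, env=env, capture_output=True, text=True,
    )
    return result.returncode
```

With a script stored in some directory `tools`, `run_script("hello.py", cwd=tools, extra_dir=tools)` succeeds, while the same call with `cwd` set to any other directory fails even though `tools` is on PYTHONPATH.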

emmats commented 10 years ago

so I changed to the tools folder and then tried to upload the files again and I know you said it wouldn't work, but I tried anyway. Got this error: sqlshare.SQLShareError: code: 404 : {"Detail":"User record not found"}

I thought SR did this all the time! Why is it so hard?? So how do I get to the directory where the SQL modules exist? I'm still not exactly sure what everything I'm doing means.

kubu4 commented 10 years ago

The error message suggests that the username you have entered in your config file isn't correct. What did you enter for your SQLshare username in your config file?
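
One way to see what is actually stored in the config file, without opening an editor, is a short script along these lines. It is a minimal sketch in Python 3: `describe_sqlshare_config` is a hypothetical helper, it assumes the `[sqlshare]` section with `user` and `password` entries shown earlier in this thread, and it deliberately does not print the key itself.

```python
import configparser


def describe_sqlshare_config(config_path):
    """Summarise a sqlshare config file without revealing the API key."""
    parser = configparser.ConfigParser(interpolation=None)
    # read() returns the list of files it managed to open and parse.
    if not parser.read(config_path):
        return "no config file at {}".format(config_path)
    if not parser.has_section("sqlshare"):
        return "config has no [sqlshare] section"
    user = parser.get("sqlshare", "user", fallback=None)
    key = parser.get("sqlshare", "password", fallback=None)
    if not user:
        return "no user entry in [sqlshare]"
    # repr() makes stray quotes or spaces in the username visible.
    return "user={!r}, password is {}".format(user, "set" if key else "missing")
```

For example, `describe_sqlshare_config(os.path.expanduser("~/.sqlshare/config"))` (after `import os`) reports the username exactly as the client would read it.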

emmats commented 10 years ago

I played around with a couple different user names and looks like I got it. Thanks!

emmats commented 10 years ago

well, except for this error: NameError: global name 'chunk_count' is not defined

What is a chunk_count and how do I define it?

dhalperi commented 10 years ago

Wow this thing is so buggy. I just pushed a new commit that should fix that error message. To get it:

git pull

then reinstall using setup.py etc. as in the instructions. I won't write them out here because I'm sure to make a mistake.

dhalperi commented 10 years ago

Good thing Bill has a student rewriting it! :)

emmats commented 10 years ago

Silly question - when do I actually type in "git pull"? I did it when I first opened terminal and got this error: fatal: Not a git repository (or any of the parent directories): .git

I then tried it after switching my directory to pythonclient and got this: error: cannot open .git/FETCH_HEAD: Permission denied

Sorry, I'm just not very good at command line.

dhalperi commented 10 years ago

Hmm, it seems like you were doing the right thing but as the wrong user. You should be running as the same user that ran git clone.


dhalperi commented 10 years ago

For example, if you ran "sudo git clone" before, you will need to use sudo here as well.
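
To check whether ownership is the cause of the "Permission denied" message, the owner of the repository's .git directory can be compared with the current user. The following is a minimal sketch for Unix-like systems such as macOS; `owned_by_current_user` and `explain_git_dir` are hypothetical helpers, not git or sqlshare functions.

```python
import os


def owned_by_current_user(path):
    """Return True if `path` is owned by the user running this script (Unix only)."""
    return os.stat(path).st_uid == os.getuid()


def explain_git_dir(repo_dir):
    """Describe who owns repo_dir/.git and what that implies for `git pull`."""
    git_dir = os.path.join(repo_dir, ".git")
    if not os.path.isdir(git_dir):
        # Matches the "Not a git repository" case: wrong directory.
        return "not a git repository: " + repo_dir
    if owned_by_current_user(git_dir):
        return "owned by you: plain 'git pull' can write to .git"
    # Typical after 'sudo git clone': files belong to root.
    return "owned by another user: run git as that user (for example with sudo)"
```

Running `explain_git_dir("sqlshare-pythonclient")` from the directory that contains the clone distinguishes the two errors quoted above.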


billhowe commented 10 years ago

It's complaining about the API key; maybe double check it in your sqlshare config file.

Your API key is:

46751347274689db4ceb3534e1e6bae2

On Fri, Apr 18, 2014 at 10:18 AM, emmats notifications@github.com wrote:

That worked until I ran multiupload (see below). I'm not sure what's wrong with my api user key because my user name is emmats@washington.edu and my password is correct. Is multiupload really picky about file format? These aren't really csv files, they are txt files that I changed the extension on (I got the same error when they were .txt).

d-69-91-218-187:tools emmatimminsschiffman$ python multiupload.py /Users/emmatimminsschiffman/Desktop/swissprot_protein_names_03212014.csv /Users/emmatimminsschiffman/Desktop/swissprot_GO_association_03212014.csv
uploading /Users/emmatimminsschiffman/Desktop/swissprot_protein_names_03212014.csv
uploading /Users/emmatimminsschiffman/Desktop/swissprot_protein_names_03212014.csv into ['swissprot_protein_names_03212014.csv']
processing chunk line 0 to 919860 (0.432422161102 s elapsed)
pushing /Users/emmatimminsschiffman/Desktop/swissprot_protein_names_03212014.csv...
parsing {u'Detail': u'Invalid api key for user: emmats@washington.edu'}...
Error uploading data in the chunk starting at pos 0 (lines 0 to 919860): code: 400 : <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
Bad Request
Bad Request
HTTP Error 400. The request is badly formed.
Found no data to upload in
Traceback (most recent call last):
  File "multiupload.py", line 40, in <module>
    main()
  File "multiupload.py", line 37, in main
    multiupload(args, options.username, options.password)
  File "multiupload.py", line 25, in multiupload
    print "Successfully uploaded " + response
TypeError: cannot concatenate 'str' and 'NoneType' objects
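
The final TypeError in this traceback is a secondary failure: the upload had already been rejected with "Invalid api key", so `response` was None, and Python cannot join a string and None with "+". A guard of the following kind would report the failure instead of crashing. This is a minimal sketch in Python 3 with a hypothetical helper named `describe_upload`; it is not the change that was actually made to multiupload.py.

```python
def describe_upload(response):
    """Build a status message, tolerating a missing (None) response."""
    if response is None:
        # The upload call returned nothing, e.g. after a rejected API key.
        return "Upload failed: no response returned"
    # str() keeps the concatenation safe for non-string responses too.
    return "Successfully uploaded " + str(response)
```

With this guard the script would still need the corrected API key; it only changes how the failure is reported.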


emmats commented 10 years ago

Thanks, Bill, I actually got through that with Steven's help and now have another error. He's going to help me in person this afternoon.

billhowe commented 10 years ago

Ah, then I posted your API key publicly for nothing. :)
