Closed havardgulldahl closed 8 years ago
Without previous knowledge of Python and without a working installation this is what I came up with.
def download(argv=None):
    if argv is None:
        argv = sys.argv[1:]
    parser = argparse.ArgumentParser(description='Download a file from Jottacloud.')
    parser.add_argument('remotefile', help='The path to the file that you want to download')
    parser.add_argument('-l', '--loglevel', help='Logging level. Default: %(default)s.',
                        choices=('debug', 'info', 'warning', 'error'), default='warning')
    parser.add_argument('-c', '--checksum', help='Verify checksum of file after download')
    args = parse_args_and_apply_logging_level(parser, argv)
    jfs = JFS.JFS()
    root_folder = get_root_dir(jfs)
    path_to_object = posixpath.join(root_folder.path, args.remotefile)
    remote_file = jfs.getObject(path_to_object)
    total_size = remote_file.size
    with open(remote_file.name, 'wb') as fh:
        bytes_read = 0
        with ProgressBar(expected_size=total_size) as bar:
            for chunk_num, chunk in enumerate(remote_file.stream()):
                fh.write(chunk)
                bytes_read += len(chunk)
                bar.show(bytes_read)
    if args.checksum:
        md5 = JFS.calculate_md5(data)
        if md5 != JFSFile.md5:
            print('''MD5 hashes don't match!''')
            answer = input('Continue: [y/n]')
            if not answer or answer[0].lower() != 'y':
                print('%s was not downloaded successfully' % args.remotefile)
                exit(1)
    print('%s downloaded successfully' % args.remotefile)
That's not that bad for something you wrote without knowing the language.
But you'll see some issues once you get your installation running (get it straight from github). So get that going, and then keep on coding :)
Here are some things I see immediately:
- data is not defined anywhere.
- There is no need for an input() prompt. A red error message (look at clint) and exit(1) is enough.
- Read the argparse.ArgumentParser docs and see how you can use store_true to actually get True or False for free from argparse.
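To illustrate the store_true hint, here is a minimal sketch using only the standard argparse module. The build_parser helper is a hypothetical name for this example and is not part of jottalib; the argument names are taken from the code above.

```python
import argparse


def build_parser():
    """Build a parser like the one in download(), with -c as a boolean flag."""
    parser = argparse.ArgumentParser(description='Download a file from Jottacloud.')
    parser.add_argument('remotefile', help='The path to the file that you want to download')
    # action='store_true' makes -c a flag that takes no value:
    # args.checksum is True when the flag is given and False otherwise.
    parser.add_argument('-c', '--checksum', action='store_true',
                        help='Verify checksum of file after download')
    return parser


if __name__ == '__main__':
    parser = build_parser()
    print(parser.parse_args(['some/file.txt']).checksum)
    print(parser.parse_args(['-c', 'some/file.txt']).checksum)
```

Without action='store_true', the -c option expects a value after it, so "if args.checksum:" would test that string (or None) instead of a real boolean.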
So there is some progress, but I got stuck on an error that I couldn't figure out how to fix (commenting out the for loop in calculate_md5 removes the error).
WARNING:py.warnings:c:\python27\lib\site-packages\jottalib\JFS.py:92: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
for data in iter(lambda: fileobject.read(size), u''):
Traceback (most recent call last):
File "C:\Python27\Scripts\jotta-download-script.py", line 9, in <module>
load_entry_point('jottalib==0.4.post1', 'console_scripts', 'jotta-download')()
File "c:\python27\lib\site-packages\jottalib\cli.py", line 240, in download
md5_lf = JFS.calculate_md5(lf)
File "c:\python27\lib\site-packages\jottalib\JFS.py", line 93, in calculate_md5
md5.update(data.encode('utf-8'))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 10: ordinal not in range(128)
I'm not sure that the way I am accessing the local file is the best. I'm also struggling to get the md5 property from the remote file. A hint in the right direction would be nice. =)
if args.checksum:
    with open(remote_file.name) as lf:
        md5_lf = JFS.calculate_md5(lf)
    md5_jf = JFS.JFSFile.md5
    if md5_lf != md5_jf:
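The UnicodeWarning and UnicodeDecodeError in the traceback above are the kind of errors Python 2 raises when byte strings are compared with, or encoded as, unicode. One way to sidestep that whole class of problem is to hash the raw bytes. The following is a minimal sketch using only hashlib, not jottalib's actual calculate_md5; the name md5_of_file and the chunk size are assumptions made for this example.

```python
import hashlib


def md5_of_file(path, chunk_size=64 * 1024):
    """Return the hex MD5 digest of the file at `path` (hypothetical helper)."""
    md5 = hashlib.md5()
    # 'rb' keeps the data as raw bytes; text mode can translate line endings
    # or decode the content, either of which changes what gets hashed.
    with open(path, 'rb') as fh:
        # The sentinel is b'' (bytes), which is what a binary read returns at EOF.
        for chunk in iter(lambda: fh.read(chunk_size), b''):
            md5.update(chunk)  # no .encode(): the chunk is already bytes
    return md5.hexdigest()
```

Reading in fixed-size chunks keeps memory use flat even for large downloads.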
I think the first error is related to issue #79. I will see if the fix there resolves it. Can anyone give me a hand with getting the md5 from the remote file?
After you call jfs.getObject(/path/to/file) and get a JFSFile object, look at JFSFile.md5. In this case remote_file is already there, so:
md5_lf = JFS.calculate_md5(open(remote_file.name)) # because we've downloaded the file to remote_file.name
md5_jf = remote_file.md5
And take it from there. :+1:
I've been trying to get it to work, but the checksum doesn't seem to be correct. I'm not sure whether this is related to running it under Windows. Anyway, below is the code in cli.py.
with open(remote_file.name, 'wb') as fh:
    bytes_read = 0
    with ProgressBar(expected_size=total_size) as bar:
        for chunk_num, chunk in enumerate(remote_file.stream()):
            fh.write(chunk)
            bytes_read += len(chunk)
            bar.show(bytes_read)
#if args.checksum:
md5_lf = JFS.calculate_md5(open(remote_file.name, 'rb')) #opening in binary mode
md5_jf = remote_file.md5
print md5_lf
print md5_jf
print('%s downloaded successfully' % args.remotefile)
The checksums I get are:
C:\Users\XX>jotta-download jottacloud.pdf
[################################] 219340/219340 - 00:00:00
f8ceede2a2ac0c52f3e3bbeb25d3fa68
9fff650be9fd5a05d531730e4350af51
jottacloud.pdf downloaded successfully
Checking the file in an external md5 checker (http://onlinemd5.com/) gives the value of: 9FFF650BE9FD5A05D531730E4350AF51
Also, doing a print data in JFS.py shows that it is missing the last rows. I have tried to figure out why, but haven't found anything.
File content when opened in notepad:
obj
<</Length 3911/Subtype/XML/Type/Metadata>>stream
<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 5.4-c005 78.147326, 2012/08/23-13:03:03 ">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about=""
xmlns:xmp="http://ns.adobe.com/xap/1.0/"
xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:pdf="http://ns.adobe.com/pdf/1.3/"
xmlns:pdfx="http://ns.adobe.com/pdfx/1.3/">
<xmp:ModifyDate>2013-03-28T12:13:18+01:00</xmp:ModifyDate>
<xmp:CreateDate>2013-03-28T12:13:17+01:00</xmp:CreateDate>
<xmp:MetadataDate>2013-03-28T12:13:18+01:00</xmp:MetadataDate>
<xmp:CreatorTool>Acrobat PDFMaker 11 for Word</xmp:CreatorTool>
<xmpMM:DocumentID>uuid:b8e0d258-8375-49f3-8e23-f7de68210a4d</xmpMM:DocumentID>
<xmpMM:InstanceID>uuid:9625165b-c271-4ea6-9002-fea7e8500cf4</xmpMM:InstanceID>
<xmpMM:subject>
<rdf:Seq>
<rdf:li>50</rdf:li>
</rdf:Seq>
</xmpMM:subject>
<dc:format>application/pdf</dc:format>
<dc:title>
<rdf:Alt>
<rdf:li xml:lang="x-default"/>
</rdf:Alt>
</dc:title>
<dc:description>
<rdf:Alt>
<rdf:li xml:lang="x-default"/>
</rdf:Alt>
</dc:description>
<dc:creator>
<rdf:Seq>
<rdf:li>roland</rdf:li>
</rdf:Seq>
</dc:creator>
<pdf:Producer>Adobe PDF Library 11.0</pdf:Producer>
<pdf:Keywords/>
<pdfx:SourceModified>D:20130328111211</pdfx:SourceModified>
<pdfx:Company/>
<pdfx:Comments/>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>
endstream
endobj
20 0 obj
<</Filter/FlateDecode/First 6/Length 58/N 1/Type/ObjStm>>stream
hÞ240V0P°±ÑwÎ/Í+Q0Ö÷ÎL)Ž640Š)‚I«RYª˜žZlg` ~ím
endstream
endobj
21 0 obj
<</Filter/FlateDecode/First 6/Length 184/N 1/Type/ObjStm>>stream
hÞlÍA‚@†á¿²7• w4("Iº”t^݉¶Ô‰i%ü÷Ñ¡Û{øx>Ð3¥Õj罿‡Lél¯©m±óðwÓ
c1ï¨+ŒÇ°X&R&H …ùDC uðY •×L•ñj_lJsCV êL¬NÄr°Åá)1”dÿ‰‹¯¸g²}BZªpÕÎUlxsª£ø@=×(Ž;;´¿2è«+Ö^ÎŽÎ7FYö` ¯dIÍ
endstream
endobj
22 0 obj
<</DecodeParms<</Columns 5/Predictor 12>>/Filter/FlateDecode/ID[<D73E8F7CBFE3364DAC1DA07F06F81058><465B159AE6F857409B69D6D4AB883CAB>]/Info 104 0 R/Length 119/Root 106 0 R/Size 105/Type/XRef/W[1 3 1]>>stream
hÞbb &FƆCL@†?ˆd©‘<f ’QH2þš–µ ‘ÌÁâÙ ’ÓÌþ &çH_°,“%Xå:^?›¡,n"Ùþ€Hþ©`]ÓÁ¤Ð
Wî«d“ŒØIÆ?ødGÉÁL2m‡Ä/@€ gõê
endstream
endobj
startxref
116
%%EOF
File content when doing print data in calculate_md5 in the JFS.py file:
obj
<</Length 3911/Subtype/XML/Type/Metadata>>stream
<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 5.4-c005 78.147326, 2012/08/23-13:03:03 ">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about=""
xmlns:xmp="http://ns.adobe.com/xap/1.0/"
xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:pdf="http://ns.adobe.com/pdf/1.3/"
xmlns:pdfx="http://ns.adobe.com/pdfx/1.3/">
<xmp:ModifyDate>2013-03-28T12:13:18+01:00</xmp:ModifyDate>
<xmp:CreateDate>2013-03-28T12:13:17+01:00</xmp:CreateDate>
<xmp:MetadataDate>2013-03-28T12:13:18+01:00</xmp:MetadataDate>
<xmp:CreatorTool>Acrobat PDFMaker 11 for Word</xmp:CreatorTool>
<xmpMM:DocumentID>uuid:b8e0d258-8375-49f3-8e23-f7de68210a4d</xmpMM:DocumentID>
<xmpMM:InstanceID>uuid:9625165b-c271-4ea6-9002-fea7e8500cf4</xmpMM:InstanceID>
<xmpMM:subject>
<rdf:Seq>
<rdf:li>50</rdf:li>
</rdf:Seq>
</xmpMM:subject>
<dc:format>application/pdf</dc:format>
<dc:title>
<rdf:Alt>
<rdf:li xml:lang="x-default"/>
</rdf:Alt>
</dc:title>
<dc:description>
<rdf:Alt>
<rdf:li xml:lang="x-default"/>
</rdf:Alt>
</dc:description>
<dc:creator>
<rdf:Seq>
<rdf:li>roland</rdf:li>
</rdf:Seq>
</dc:creator>
<pdf:Producer>Adobe PDF Library 11.0</pdf:Producer>
<pdf:Keywords/>
<pdfx:SourceModified>D:20130328111211</pdfx:SourceModified>
<pdfx:Company/>
<pdfx:Comments/>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
Sorry, got it to work! I had one indent too many, so it was missing out on the last chunk. =)
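One plausible reading of the indentation problem, offered here as an assumption rather than a confirmed diagnosis, is that the hash was computed while the output file was still open for writing, so the final buffered data had not yet reached the disk. The Python 3 sketch below reproduces that effect with a small payload; md5_of_path and demo are hypothetical names for this example.

```python
import hashlib
import os
import tempfile


def md5_of_path(path):
    """Hash whatever bytes are currently on disk at `path`."""
    with open(path, 'rb') as fh:
        return hashlib.md5(fh.read()).hexdigest()


def demo():
    payload = b'x' * 100  # small enough to sit entirely in the write buffer
    expected = hashlib.md5(payload).hexdigest()
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'download.bin')
        with open(path, 'wb') as fh:
            fh.write(payload)
            # One indent too deep: the file is still open and unflushed here.
            too_early = md5_of_path(path)
        # Correct level: leaving the with-block has flushed and closed the file.
        after_close = md5_of_path(path)
    return too_early == expected, after_close == expected
```

With a larger file, everything except the still-buffered tail would typically be on disk, which matches the symptom of only the last rows being missing.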
How do I go forward and suggest the new code (this is the first time I've used GitHub)?
I think I need to rewrite some of the code proposed in the version I submitted, since there have been quite a lot of changes and fixes since I wrote it. Any help is appreciated.
Via private email, the suggestion was raised that our tools could automatically verify the content after download, by comparing the md5 checksum from jottacloud and that of the newly downloaded file.
Sounds like a nice command line option for download() in cli.py. This would be a nice way for beginners to get to know the codebase.
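As a rough sketch of what such verification could look like, the hash can be updated while the chunks are written, so the file never has to be read back. This is an illustration under assumptions, not jottalib code: save_and_hash and checksum_matches are hypothetical names, chunks stands in for something like remote_file.stream(), and remote_md5 for remote_file.md5.

```python
import hashlib


def save_and_hash(chunks, dest_path):
    """Write byte chunks to dest_path and return the hex MD5 of what was written."""
    md5 = hashlib.md5()
    with open(dest_path, 'wb') as fh:
        for chunk in chunks:
            fh.write(chunk)
            md5.update(chunk)  # hash the same bytes that go to disk
    return md5.hexdigest()


def checksum_matches(local_md5, remote_md5):
    """Compare two hex digests, ignoring letter case."""
    # Tools differ in whether they print hex digests in upper or lower case.
    return local_md5.lower() == remote_md5.lower()
```

A caller could then print an error and exit(1) when checksum_matches returns False, in line with the earlier review comment.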