RocHack / bb

Command line Blackboard client
MIT License

ability to get materials uploaded for a course (lectures, etc) #9

Closed jeremywrnr closed 9 years ago

jeremywrnr commented 9 years ago

Getting to see my grades is awesome, but it would also be good if bb could list the course materials that a course has (like lectures or exam material the professor provides), and then give the user the option to download them.

clehner commented 9 years ago

This would be good. Maybe bb materials [course_query] using get_course() to select the course, and then list the course materials and prompt the user to pick & read/download them.

clehner commented 9 years ago

The get_assignments functions could probably be used or adapted for finding course materials to download.

jeremywrnr commented 9 years ago

I made the materials branch, which will work with the 'materials' command, and then either show the user their courses, or look for the course they entered. Then it parses a list of files and prints them, along with folders. Next, I'll try to give the option to download them. [image]

clehner commented 9 years ago

Nice! When I try it with my classes, the titles of assignments are shown. With one of my classes, the path for uploading the assignment is also included:

$ bb materials 256
Found CSC256 2014FALL 30378 OPERATING SYSTEMS.
/webapps/blackboard/execute/uploadAssignment?content_id=_2387337_1&course_id=_359319_1&assign_group_id=&mode=view Homework 1
/webapps/blackboard/execute/uploadAssignment?content_id=_2387362_1&course_id=_359319_1&assign_group_id=&mode=view Homework 2
...

Course materials that are uploads or links are not shown in my output.

Looks like there is a temp file committed, toParse. Also check the spaces/tabs consistency.

jeremywrnr commented 9 years ago

Thank you! Cleaned out the temp file, and formatted it so it is a bit clearer, and also made it so that it tabs in files, but not folder labels.

Yeah, I was using a color label most documents had to parse it:

sed -n '/style="color:#000000;"/{ s_.*<span[^>]*>\(.*\)</span>.*_\1_; p; }'

So if the teacher changes the color it won't work. See below (from my.rochester.edu): [image]

And then the command line output misses the red ones: [image]

The rest of my uploads are visible. I don't have any assignments or links in any of my classes, so I can't test those. Perhaps there is something else that matches them uniquely?
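To illustrate the color dependence described above, here is a minimal sketch that runs the same sed expression over made-up markup. The sample HTML and the looser `any_color` pattern are assumptions for illustration; the real Blackboard pages may be structured differently.

```shell
# Made-up sample of Blackboard-style markup; the real page structure may differ.
html='<a href="/x/1"><span style="color:#000000;">Lecture 1 slides</span></a>
<a href="/x/2"><span style="color:#FF0000;">Exam review</span></a>'

# The expression from the comment: only lines with the black color style print.
by_color=$(printf '%s\n' "$html" |
  sed -n '/style="color:#000000;"/{ s_.*<span[^>]*>\(.*\)</span>.*_\1_; p; }')

# Looser variant (an assumption): accept a span with any hex color value.
any_color=$(printf '%s\n' "$html" |
  sed -n '/<span style="color:#[0-9A-Fa-f]*;">/{ s_.*<span[^>]*>\(.*\)</span>.*_\1_; p; }')

printf 'by_color:\n%s\nany_color:\n%s\n' "$by_color" "$any_color"
```

The first pattern misses the red item; the second picks up both, though it would still miss titles that are not wrapped in a colored span at all.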

Finally on the tabs/spaces thing, I set up vim to use spaces, so that is what caused the issue most likely. Is there a good tool that can convert all space indents to tabs that you know of?

clehner commented 9 years ago

I'll take a look at why mine aren't matching.

To convert spaces to tabs in vim I usually do :s/    /\t/g (4 spaces), or :retab (or ==!). I just now read :help retab and apparently it has to be :retab! (the ! is needed for spaces -> tabs). Doing those for the whole file would affect areas that are not indentation, though (like the strings in bb_help), so I would search for the spaces and then fix each line or group of lines. Also, when using git diff or e.g. git commit -v, indentation issues may be more noticeable because the terminal defaults to 8 spaces for tabs.
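Outside of vim, the same conversion can be limited to leading indentation so that runs of spaces inside strings are left alone. This is a minimal sketch, assuming 4-space indents; `spaces_to_tabs` is a hypothetical helper name, not part of bb.

```shell
# Convert each leading group of four spaces to a tab, leaving spaces
# that appear after the first non-blank character untouched.
spaces_to_tabs() {
  tab=$(printf '\t')
  # Loop: while the line starts with zero or more tabs followed by four
  # spaces, replace those four spaces with one more tab.
  sed -e ':a' -e "s/^\(${tab}*\)    /\1${tab}/" -e 'ta'
}

# Eight leading spaces, plus runs of spaces inside a string.
sample='        echo "keep    these    spaces"'
converted=$(printf '%s\n' "$sample" | spaces_to_tabs)
printf '%s\n' "$converted"
```

To convert a file, something like `spaces_to_tabs < bb > bb.tabs` followed by a review of the diff would work; partial indents of fewer than four spaces are left as they are.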

jeremywrnr commented 9 years ago

Awesome. I used :%s/    /\t/gc, which lets you confirm each replacement individually. This allowed me to avoid replacing groups of four spaces that were inside of strings. I checked over the diff, and it looks more uniform now.

clehner commented 9 years ago

Cool, that is a better way of doing it. Looks good now.

jeremywrnr commented 9 years ago

Okay, I got the full URL of the document I want to download, but I am not sure how to fetch it from the command line. I tried to just curl the file locally, but got an HTML response instead of the actual document I wanted. How did you end up doing this for quikpay?

Also, I am not quite sure how to get the filetype/appropriate extension for the download (because the link from bb doesn't return a name (that I've seen)), but I will maybe try to parse it and add it onto the file name.

clehner commented 9 years ago

Is there any error message in the html response? Does the response include a link to the actual file to download?

To get the right filename and extension you might be able to use curl's -J option to let the server set the filename with the Content-Disposition response header, and use -O to have curl write the response to that filename. Then with -w '%{filename_effective}' you could have curl write the filename on stdout so you can retrieve it.

e.g.

filename=$(bb_request -OJw '%{filename_effective}' "$url")
echo "Downloaded $filename"

If the response doesn't include the Content-Disposition header and so -J doesn't help, then you could try using -w '%{content_type}' and parsing the resulting content type to create an extension.
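The content-type fallback could look roughly like the following. This is only a sketch: `ext_for_content_type` is a hypothetical helper, and the mapping table is illustrative rather than complete.

```shell
# Map a Content-Type value (as printed by curl -w '%{content_type}')
# to a file extension. Hypothetical helper; the table is not exhaustive.
ext_for_content_type() {
  type=${1%%;*}      # drop parameters such as "; charset=utf-8"
  type=${type%% *}   # drop any trailing space left before the ";"
  case "$type" in
    application/pdf)    echo pdf ;;
    application/zip)    echo zip ;;
    application/msword) echo doc ;;
    text/html)          echo html ;;
    text/plain)         echo txt ;;
    *)                  echo bin ;;   # unknown type: generic extension
  esac
}

ext_for_content_type 'application/pdf'
ext_for_content_type 'text/html; charset=utf-8'
```

An `html` result is also a useful signal that the request returned an error or login page rather than the document itself.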

jeremywrnr commented 9 years ago

I was parsing the URLs OK, but then I was choosing the first URL, which just returns a link to the folder all of them are in, so that was (part of) the problem, it seems. The link that I can grab is also not a direct link, and when I tried to get the content type, it returned ambiguous redirect. I will check out the HTML it is returning and see if I can parse that response for an actual link to download from.

jeremywrnr commented 9 years ago

So I managed to get the right URL for the document and print it out, meaning that if you copied and pasted it into your web browser's address bar, it would open (assuming you are logged in there). However, I have not been able to download that document using curl at all... I tried to look at what you did for the assignment uploads but I didn't really follow. Here is what I get instead of a PDF when running bb_request $full_url -o $filename (with both URL and filename verified):

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN""http://www.w3.org/TR/html4/strict.dtd">
<HTML><HEAD><TITLE>Bad Request</TITLE>
<META HTTP-EQUIV="Content-Type" Content="text/html; charset=us-ascii"></HEAD>
<BODY><h2>Bad Request</h2>
<hr><p>HTTP Error 400. The request is badly formed.</p>
</BODY></HTML>

clehner commented 9 years ago

Have you tried using -L? That should tell curl to follow a redirect, if that is what is happening. Otherwise, what is the output of the request when using -i (to show the headers)?

jeremywrnr commented 9 years ago

Yep, -L worked. Thanks. It doesn't work with %{filename_effective}, so right now I am manually parsing the URL to get the filename and extension, and it seems to work.
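One way to derive a filename from a URL by hand is with shell parameter expansion. This is a sketch under assumptions: `filename_from_url` is a hypothetical helper, the URL is made up, and the parsing actually used in the branch may differ.

```shell
# Derive a local filename from a download URL: strip the query string
# and fragment, then keep the last path segment. No URL-decoding is done.
filename_from_url() {
  url=${1%%\?*}     # drop "?query"
  url=${url%%\#*}   # drop "#fragment"
  url=${url%/}      # drop one trailing slash, if any
  printf '%s\n' "${url##*/}"
}

# Hypothetical URL, for illustration only.
filename_from_url 'https://example.edu/bbcswebdav/course/mp4.pdf?session=abc'
```

Names containing percent-encoded characters (for example `%20`) would come through encoded, so a decoding step may be needed for such files.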

Right now, it only works with one of my classes (the one I was testing on). For the others, my sed script is not parsing the link. The links it grabs can be observed with the -v option. If it gets the link, it will download the file correctly.

I think I may need to add more lines after I match the details div... will follow up soon.

jeremywrnr commented 9 years ago

Ok, as of b6b4e8a downloading works on all of my classes that have materials. Trying to download a folder currently just returns an HTML page, though; at some point I may make it so that selecting a folder allows you to download all the files inside that folder. Let me know if it works for you!

clehner commented 9 years ago

The assignments are shown for one of my courses:

Found WRT273 2014FALL 74678 COMMNCTNG YOUR PROF IDENTITY.
Found 6 documents.
1) Course Materials [dir]  
2)  FA 1 Elev. Pitch / Comp. Descrip.
3)  FA2 - Networking Note (Revision)
4)  MidPoint Portfolio
5)  Informational Interview & Reflection Paper
6)  Final Portfolio

I have trouble making downloads:

Choose a material to download: 2
sed: 1: "/  FA 1 Elev. Pitch / Co ...": invalid command code C
Target material: FA 1 Elev. Pitch / Comp. Descrip.
./bb: line 944: : No such file or directory
Downloaded

^ The sed error is from putting the unescaped assignment name in a regex (line 923). Consider using grep -F instead.
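A small sketch of the difference, using titles from the listing above. The variable names are made up for illustration, and this is not the code at line 923.

```shell
# Sample listing; titles are taken from the output shown above.
listing='FA 1 Elev. Pitch / Comp. Descrip.
FA2 - Networking Note (Revision)
Final Portfolio'

title='FA 1 Elev. Pitch / Comp. Descrip.'

# Interpolating the title into a sed address ends the regex at the first
# "/", and sed then tries to run the rest of the title as a command.
if printf '%s\n' "$listing" | sed -n "/$title/p" >/dev/null 2>&1; then
  sed_status=ok
else
  sed_status=failed
fi

# grep -F treats the title as a fixed string; -x requires a whole-line
# match, and -- protects titles that start with a dash.
match=$(printf '%s\n' "$listing" | grep -Fx -- "$title")

printf 'sed: %s\ngrep -F: %s\n' "$sed_status" "$match"
```

With `-F`, characters such as `/`, `.`, and `(` in a title need no escaping.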

Choose a material to download: 6
Target material: Final Portfolio
./bb: line 944: : No such file or directory
Downloaded

One of my courses doesn't show any materials. I'll look into that.

jeremywrnr commented 9 years ago

Ok, I realized there was a typo messing up grabbing the link; dcd82ef should work. I also changed it so that grep will parse the query as a string instead of passing it to sed as a regex, so hopefully it can catch your document with the slash.

clehner commented 9 years ago

The items for that course are now matched successfully and I got some downloads to work for files. Should we offer downloads for assignments with attachment(s) too?

[screenshot of item from materials]

How should we represent such an item with multiple attachments to download? Perhaps one link per line?

11)     Machine Problem 4: cr3.c
12)     Machine Problem 4: mp4.pdf

And then if the item is an assignment with no attachments, should we show it in the list at all? [screenshot: assignment with no downloads]

jeremywrnr commented 9 years ago

Cool! Yes, this would be the next step to add attachments for assignments. Can you run materials -v and either send me the files or add them to the repo, so I can make a sed regex to grab the links from them?

If there are no attachments, it should not be shown. I think I will do this by checking whether the next line is a link, or whether the current line has [dir] in it, and otherwise deleting the line. I can do this.
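The rule described here (keep a title only if a link follows it or it is a folder) could be sketched with awk. The intermediate listing format below is an assumption made for illustration, as are the example paths; the format bb uses internally may differ.

```shell
# Hypothetical intermediate listing: a title line, optionally followed by
# a link line that starts with "/" or "http".
listing='Course Materials [dir]
Machine Problem 4
/bbcswebdav/example/mp4.pdf
Final Portfolio
Homework 1
/bbcswebdav/example/hw1.pdf'

filtered=$(printf '%s\n' "$listing" | awk '
  function is_link(s) { return s ~ /^(\/|http)/ }
  # Decide about the previous line once the current line is known:
  # keep it if it is a link, a folder, or a title followed by a link.
  NR > 1 && (is_link(prev) || index(prev, "[dir]") || is_link($0)) { print prev }
  { prev = $0 }
  # The last line has no successor, so keep it only if it is a link or folder.
  END { if (NR > 0 && (is_link(prev) || index(prev, "[dir]"))) print prev }
')
printf '%s\n' "$filtered"
```

In this sample, "Final Portfolio" is dropped because no link follows it, while the folder and both items with attachments are kept.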

Also, I will try and simplify the stuff with s_url and t_url, it is likely I made an unnecessarily complex solution.

jeremywrnr commented 9 years ago

Ok, based on the HTML you sent me, 63022a0 should work with file attachments... let me know how it goes. There is one line per link, "attachment:" precedes each file name, and the assignment name precedes all of its attachments:

Machine Problem 4
attachment: cr3.c
attachment: mp4.pdf

clehner commented 9 years ago

Yes, it works. It produces numbered lines not corresponding to a download, which is not ideal, but tolerable:

26)     Machine Problem 3
27)     attachment: mp3.pdf
28) 
29)     Machine Problem 4
30)     attachment: cr3.c
31)     attachment: mp4.pdf
32) 

How should specifying a material or attachment name work?

$ bb materials -h
Usage: bb materials [-v] [<course>] [<document>]

Query for attachment name?

$ bb materials 256 cr3.c
Found CSC256 2014FALL 30378 OPERATING SYSTEMS.
No materials found for current course.

Query for assignment name?

$ bb materials 256 'Problem 4'
Found CSC256 2014FALL 30378 OPERATING SYSTEMS.
query: Problem 4
No materials found for current course.

Query for assignment name with one word:

$ bb materials 256 4
Found CSC256 2014FALL 30378 OPERATING SYSTEMS.
Found 5 documents.
1)      Homework 4                     4)       attachment: mp4.pdf
2)      attachment: homework4.pdf  5)   Mini-Homework 4
3)      Machine Problem 4
Choose a file to download: ^C

jeremywrnr commented 9 years ago

Cool, glad to have made some progress. I can parse the last newline off of that, and it should clean up the output for the attachments.

It should take the form of 256 Homework 4, and then return everything that matches homework 4 in course 256, but obviously it isn't doing that right now... I will also look into why the querying isn't working properly; I was having issues with multi-word queries as well.
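A minimal sketch of the intended matching, assuming the query is compared against titles as a case-insensitive fixed string. `filter_titles` is a hypothetical helper, not a function in bb, and whether quoting is the cause of the multi-word problem here is only a guess.

```shell
# Titles taken from the output shown earlier in this thread.
titles='Homework 4
attachment: homework4.pdf
Machine Problem 4
attachment: mp4.pdf
Mini-Homework 4'

# Print the titles containing the query, ignoring case. Quoting "$query"
# keeps a multi-word query as a single grep pattern instead of letting
# the shell split it into separate arguments.
filter_titles() {
  query=$1
  grep -iF -- "$query"
}

matches=$(printf '%s\n' "$titles" | filter_titles 'homework 4')
printf '%s\n' "$matches"
```

Because the match is a plain substring test, `homework 4` selects "Homework 4" and "Mini-Homework 4" but not "attachment: homework4.pdf", which has no space before the 4.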

jeremywrnr commented 9 years ago

Ok, the output and the queries should be fixed, but I don't have classes with assignments so I can't be entirely sure. How does 5c98d11 perform?

clehner commented 9 years ago

$ bb materials 256 'Problem 4'
Found CSC256 2014FALL 30378 OPERATING SYSTEMS.
Target material: Machine Problem 4
/usr/local/bin/bb: line 1335: : No such file or directory
Downloaded
$ bb materials 256 cr3.c
Found CSC256 2014FALL 30378 OPERATING SYSTEMS.
Target material: attachment: cr3.c
Downloaded cr3.c

Good enough

jeremywrnr commented 9 years ago

I made a small change to the attachment parsing... it may work now (hopefully), and I just updated the readme to show the new functionality as well. Think it's merge ready?

clehner commented 9 years ago

$ bb materials 256 'Problem 4'
Found CSC256 2014FALL 30378 OPERATING SYSTEMS.
Found 3 documents.
1)      Machine Problem 4
2)      attachment: cr3.c
3)      attachment: mp4.pdf
Choose a file to download: 2
Target material: attachment: cr3.c
Downloaded cr3.c
$ bb materials 256 mp4.pdf
Found CSC256 2014FALL 30378 OPERATING SYSTEMS.
Target material: attachment: mp4.pdf
Downloaded mp4.pdf

Nice. I think it's ready to merge.