Open GoogleCodeExporter opened 9 years ago
Darn it, I missed this report :(
OK, so it seems that HUJI isn't working 100%.
We need to work out how to reproduce the problem: is it certain types of files, or specific courses?
I can't deal with a problem that I can't reproduce.
Original comment by Wolf1...@gmail.com
on 29 Apr 2009 at 4:21
I have files missing, too.
I'm at BGU, not HUJI.
I don't know if it's the same issue, because I get a message at the end saying that some files were skipped, and it also says "(Probably OK)". I'm not sure why.
When I try to DL them manually, it's fine.
The files were all PDFs, but other PDFs from the same course came down fine, even from the same directory. I couldn't even find a pattern like "file name is in Hebrew" that differentiates these files.
I'm sorry I'm not more helpful. If you have any idea that you'd like to investigate, let me know.
Original comment by NoamNelke
on 23 May 2009 at 10:08
OK, this seems to be confirmed.
The problem also seems to have appeared at TAU (during the last couple of months).
I can only deduce that there were some changes to the Highlearn system (a change of font, or a change of some words in the final HTML that holds the actual HTTP links).
This isn't easy to fix, but this is what should be done:
1. Find a course with enough files that don't behave.
2. Use an HTTP sniffer to record the final HTML that miner-highlearn downloads from the server.
3. Compare it to the final HTML that is downloaded manually using Internet Explorer.
4.a. If the HTML files are the same: good, just fix the regex that finds the actual link.
4.b. If the files are different: damn, back-trace to the previous file downloaded and search for differences.
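For anyone picking this up, steps 3 and 4 could be sketched roughly like this in Python. This is only an illustration, not miner-highlearn's actual code: the function names are made up, and the `href` pattern is a generic stand-in, since the real Highlearn markup and the tool's real regex aren't shown in this thread. The two HTML strings are assumed to be the captures from the sniffer and from the browser.

```python
import difflib
import re

# Assumption: a generic href matcher. The real pattern used by
# miner-highlearn (and the real Highlearn markup) may look quite different.
LINK_RE = re.compile(r"""href\s*=\s*["']([^"']+)["']""", re.IGNORECASE)


def extract_links(html):
    """Return every href value found in the HTML, in document order."""
    return LINK_RE.findall(html)


def html_diff(tool_html, browser_html):
    """Return unified-diff lines between the two captures ([] if identical)."""
    return list(
        difflib.unified_diff(
            tool_html.splitlines(),
            browser_html.splitlines(),
            fromfile="miner-highlearn",
            tofile="browser",
            lineterm="",
        )
    )


def diagnose(tool_html, browser_html):
    """Apply the 4.a / 4.b decision to two captured pages."""
    diff = html_diff(tool_html, browser_html)
    if not diff:
        # 4.a: same HTML, so the link-finding regex is the suspect.
        # Return what the regex finds so missing files can be spotted.
        return "same HTML: check the link regex", extract_links(tool_html)
    # 4.b: different HTML, so the divergence happened in an earlier request.
    return "HTML differs: back-trace to the previous request", diff
```

In the 4.a case, the returned link list can be compared by eye against the files that were skipped; in the 4.b case, the diff lines show where the two captures part ways.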
I'm writing this because I have no intention of fixing this any time soon; I'm finishing my studies this semester!!!! :)
If anyone tries to do this, you're welcome to ask for advice using GTalk or ICQ.
Cheers
Original comment by Wolf1...@gmail.com
on 4 Jun 2009 at 9:40
Original issue reported on code.google.com by Wolf1...@gmail.com
on 1 Mar 2009 at 11:12