kost / dvcs-ripper

Rip web accessible (distributed) version control systems: SVN/GIT/HG...
GNU General Public License v2.0
1.66k stars, 309 forks

Errors while downloading #5

Closed SergeC closed 9 years ago

SergeC commented 9 years ago

After running ../dvcs-ripper/rip-git.pl -s -v -u https://qwe.to/.git/ I got a bunch of other errors:

error: bad graft data:
error: Could not read 324324erewrewr
error: inflate: data stream error (incorrect header check)
error: unable to unpack 3432432efrdsfsdf header
error: inflate: data stream error (incorrect header check)
fatal: loose object 343243erewrew (stored in .git/objects/01/wr324324sdf) is corrupt

kost commented 9 years ago

Is it possible to provide the .git/objects/01/wr324324sdf file so that I can inspect its content?
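For anyone who wants to inspect such a file locally, here is a minimal Python sketch (not part of dvcs-ripper, which is written in Perl). It relies on the fact that a git loose object is a zlib stream whose inflated content starts with a "<type> <size>" header followed by a NUL byte. The function name classify_loose_object and the three labels it returns are made up for this illustration.

```python
import zlib

# Object types that can appear in a loose object header.
GIT_TYPES = {b"blob", b"tree", b"commit", b"tag"}


def classify_loose_object(raw: bytes) -> str:
    """Return "ok", "html" or "corrupt" for the raw bytes of a loose object file.

    Hypothetical helper for illustration only.
    """
    try:
        data = zlib.decompress(raw)
    except zlib.error:
        # Not a zlib stream at all. Check whether it looks like a web page
        # that the server returned in place of the object.
        head = raw[:200].lstrip().lower()
        if head.startswith(b"<!doctype") or b"<html" in head:
            return "html"
        return "corrupt"

    # Inflated content must be b"<type> <size>\0<body>".
    header, sep, body = data.partition(b"\0")
    if not sep:
        return "corrupt"
    kind, _, size = header.partition(b" ")
    if kind in GIT_TYPES and size.isdigit() and int(size) == len(body):
        return "ok"
    return "corrupt"
```

Usage would be something like classify_loose_object(open(path, "rb").read()) on the file named in the "fatal: loose object ..." message. A result of "html" suggests the server sent a page instead of the object.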

SergeC commented 9 years ago

Can we continue discussion by email?

kost commented 9 years ago

Sure! Send all the details to my gmail.com address.

On Fri, Jun 19, 2015 at 10:22 AM, SergeC notifications@github.com wrote:

Can we continue discussion by email?

— Reply to this email directly or view it on GitHub https://github.com/kost/dvcs-ripper/issues/5#issuecomment-113428715.

Vlatko Kosturjak, Kost

SergeC commented 9 years ago

Please let me know your email address since it is not available on your profile page.

kost commented 9 years ago

Just run "git log" in the dvcs-ripper directory cloned from GitHub; the commit author lines show my address.


SergeC commented 9 years ago

Did you get my email? If so, please decide what to do.

kost commented 9 years ago

The problem here is probably that the site returns a default page instead of a 404. You would have to change the source so that it distinguishes correctly between 404 and 200.

For example, to check for this problem, request a nonexistent URI: https://qwe.to/.git/this-file-does-not-exists

If you don't get a 404, but a 200 with some default output, you will have to change the source of dvcs-ripper to recognize that case, because otherwise the HTML of the default page is downloaded and saved as a git object.
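The check described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the code dvcs-ripper uses: the names http_fetch and serves_200_for_missing are hypothetical, and the probe path is a random hex string where the example above uses a fixed file name.

```python
import urllib.error
import urllib.request
import uuid


def http_fetch(url, timeout=10):
    """GET a URL and return (status, body); HTTP errors are returned, not raised."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        return err.code, err.read()


def serves_200_for_missing(base_url, fetch=http_fetch):
    """Return True if a path that cannot exist under base_url answers with 200.

    Hypothetical helper: `fetch` is injectable so the logic can be
    exercised without network access.
    """
    # A random name is very unlikely to exist on the server.
    probe = base_url.rstrip("/") + "/" + uuid.uuid4().hex
    status, _body = fetch(probe)
    return status == 200
```

Calling serves_200_for_missing("https://qwe.to/.git/") and getting True would mean that the status code alone cannot be trusted for that site, so downloaded objects need a content check as well.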

Since you get a 404 for some of the objects, the next problem will be packs. See more about it here: https://github.com/kost/dvcs-ripper/issues/6

kost commented 9 years ago

BTW, the last few commits implemented recognition of 404as200 pages (a 200 response where a 404 is expected): https://github.com/kost/dvcs-ripper/commit/b885f76dadc283a099d5372f0c5b0fb41005467d

So I would recommend trying again with a newer version of dvcs-ripper: git pull