hasufell / portage-gentoo-git-config

Configuring portage to use the gentoo git mirrors and update relevant metadata

Avoid using rsync #15

Open hasufell opened 8 years ago

hasufell commented 8 years ago

And use e.g. https://github.com/gentoo-mirror/gentoo for initial cache update.

hasufell commented 8 years ago

maybe @mgorny has an idea?

Is there a way to get the md5-cache subdir of that mirror painlessly? A sparse checkout also fetches everything, afaik.
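For reference, a minimal sketch of what a md5-cache-only checkout could look like. It assumes git 2.25 or newer (for the sparse-checkout command) and a server that honours partial-clone filters; where the filter is not supported, git falls back to fetching all objects of the shallow clone. The helper name fetch_md5_cache and its arguments are hypothetical, not part of this repository's scripts.

```shell
#!/bin/sh
# Sketch only: fetch just metadata/md5-cache from a git mirror.
# fetch_md5_cache is a hypothetical helper, not an existing script.
fetch_md5_cache() {
    repo_url=$1            # e.g. https://github.com/gentoo-mirror/gentoo
    dest=$2                # target directory (must not exist yet)
    branch=${3:-master}    # branch to check out (assumed default: master)

    # Shallow clone without blobs and without populating the working tree.
    git clone --quiet --depth 1 --filter=blob:none --no-checkout \
        --branch "$branch" "$repo_url" "$dest" || return 1

    # Restrict the working tree to the cache directory (in cone mode,
    # files sitting directly in the parent directories are kept too).
    git -C "$dest" sparse-checkout set metadata/md5-cache || return 1

    # Populate the working tree; with a partial clone, the blobs that
    # are actually needed are fetched at this point.
    git -C "$dest" checkout --quiet "$branch"
}
```

It could be called as `fetch_md5_cache https://github.com/gentoo-mirror/gentoo /tmp/gentoo-cache`. Whether this is light enough on the server side to count as "painless" has not been measured here.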

redneb commented 8 years ago

BTW, I've started using https://github.com/gentoo-mirror/gentoo now that it includes md5-cache and it works just fine. It includes everything except the herds.xml file (the absence of which has not caused any problems so far) and therefore it does not require any postsync scripts.

hasufell commented 8 years ago

However, that mirror isn't useful for developers or anyone else who wants to contribute to the repository, since it contains modified history.

junghans commented 8 years ago

I was thinking along the lines of

wget -r --no-parent https://raw.githubusercontent.com/gentoo-mirror/gentoo/master/metadata/md5-cache/

or something like this.

hasufell commented 8 years ago

mh yeah, but the timestamp looks weird: https://github.com/gentoo-mirror/gentoo/blob/master/metadata/timestamp.chk

mgorny commented 8 years ago

@junghans, please don't do stuff like this. This would be serious abuse of github.

@hasufell, since the modifications are committed as merge commits, the repository contains the original history as well. In fact, I could even modify the scripts to run a second branch with original history.

hasufell commented 8 years ago

> This would be serious abuse of github.

err wat? are you from github staff? :P

> In fact, I could even modify the scripts to run a second branch with original history.

I'm not sure what the workflow would look like exactly. Sounds like I'd have to switch branches for development and then lose the cache etc.

However, for non-development purposes, this probably makes sense.

kentfredric commented 7 years ago

> err wat? are you from github staff? :P

GitHub has documented (somewhere) that raw links should not be over-used. They're intended for "user-to-github" use, not "audience-on-developer-platform-to-github", because it's a serious performance penalty for GitHub to dispatch content straight from git.

And they've suggested/stated that if traffic to a raw URL gets a bit warm, it might start 404ing, and that they'd rather people in charge of repos used github-pages for such a thing (which checks out on push, so a fetch is then just a read from disk).

Fortunately, I believe you can now state an arbitrary branch to use for github-pages, and it would then turn up at:

But I don't know how their page-building engine would handle that either; there could be a notable lag between the push and the content being available.