edsl / endefensadelsl.org

We moved to https://0xacab.org/edsl

Mirrors of the references #13

Open mauriciopasquier opened 11 years ago

mauriciopasquier commented 11 years ago

So that we always have some version of the links available in case they change, go down, etc., we could mirror them on our site or submit them to the Wayback Machine if they are not there already. We can put the mirrored links in the HTML version.

fauno commented 11 years ago

+1 for the Wayback Machine. The links can be extracted with this: grep -ro "https\?://[^'\"]\+"

fauno commented 11 years ago

Er, that one is for extracting them from the HTML. The [^'\"] group would have to be changed to include the closing delimiters used by markdown/bibtex.
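A minimal sketch of that adjustment, written in Python rather than grep so the character class is easier to read. The excluded characters (`)`, `]`, `}`, `<`, `>`, and quotes) are an assumption about which closing delimiters show up in the markdown and bibtex sources; `extract_urls` is a hypothetical helper, not something from the repository.

```python
import re

# Stop at whitespace, quotes, and the closing delimiters used by
# HTML attributes, Markdown links, and BibTeX fields (assumed set).
URL_RE = re.compile(r"""https?://[^\s'"<>)\]}]+""")


def extract_urls(text):
    """Return the unique URLs found in text, in order of first appearance."""
    seen = []
    for match in URL_RE.finditer(text):
        # Drop sentence punctuation that trails the URL.
        url = match.group(0).rstrip(".,;:")
        if url not in seen:
            seen.append(url)
    return seen
```

One trade-off of this choice: a URL that legitimately contains a closing parenthesis or bracket gets cut short, so the extracted list is worth a quick review before archiving.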

mauriciopasquier commented 10 years ago

It seems there is no way to archive the links we have.

How can I get my site included in the Wayback Machine?

Much of our archived web data comes from our own crawls or from Alexa Internet's crawls. Neither organization has a "crawl my site now!" submission process. Internet Archive's crawls tend to find sites that are well linked from other sites. The best way to ensure that we find your web site is to make sure it is included in online directories and that similar/related sites link to you.

Alexa Internet uses its own methods to discover sites to crawl. It may be helpful to install the free Alexa toolbar and visit the site you want crawled to make sure they know about it.

Regardless of who is crawling the site, you should ensure that your site's 'robots.txt' rules and in-page META robots directives do not tell crawlers to avoid your site.

When a site is crawled, there is usually at least a 6-month lag, and sometimes as much as a 24-month lag, between the date that web pages are crawled and when they appear in the Wayback Machine.

In some cases, crawled content from certain projects may appear in a much shorter timeframe — as little as a few weeks from when it was crawled. Older material for the same pages and sites may still appear separately, months later.

fauno commented 10 years ago

Mauricio Pasquier Juan notifications@github.com writes:

It seems there is no way to archive the links we have.

then we archive them ourselves

:{

mauriciopasquier commented 10 years ago

a GET here:

http://web.archive.org/save/<url>

seems to archive them :D
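A rough sketch of issuing that GET from a script, using only the Python standard library. The endpoint is the one quoted above; `save_url` and `archive` are hypothetical names, and how the service responds (status codes, redirects, rate limits) is not confirmed anywhere in this thread, so treat the returned status as informational only.

```python
import urllib.request

# Endpoint as quoted in the thread; the target URL is appended as-is.
SAVE_ENDPOINT = "http://web.archive.org/save/"


def save_url(url):
    """Build the address that asks the Wayback Machine to save url."""
    return SAVE_ENDPOINT + url


def archive(url, opener=urllib.request.urlopen):
    """Issue a GET to the save endpoint and return the HTTP status code.

    opener is injectable so the function can be tried without network access.
    """
    with opener(save_url(url)) as response:
        return response.status
```

Calling `archive("http://amberlink.org/")` would request `http://web.archive.org/save/http://amberlink.org/`. When looping over many links, pausing between requests seems prudent, though no limit is documented here.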

fauno commented 9 years ago

another one: http://amberlink.org/

mauriciopasquier commented 9 years ago

Amber sounds really good