A script to find and download PDF publications from www.icann.org
GNU General Public License v3.0
Proposing two scripts: one to generate text files from the PDFs, and another to create a TSV file compatible with https://cloud.google.com/storage-transfer/docs/create-url-list from all or a subset of the files. #9
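For reviewers unfamiliar with the URL list format, here is a minimal sketch of what the TSV generation could look like. It is not the script proposed in this PR. It assumes the format described on the linked page: a first line of `TsvHttpData-1.0`, then one row per object with the URL, an optional size in bytes, and an optional base64-encoded MD5, with rows sorted by URL. The function names, `base_url`, and the assumption that local copies are byte-identical to what is served at each URL are all hypothetical.

```python
import base64
import hashlib
from pathlib import Path
from urllib.parse import quote

HEADER = "TsvHttpData-1.0"  # required first line of a URL list


def url_list_rows(files, base_url):
    """Return (url, size, md5_base64) tuples for local files.

    Assumption: each local file is byte-identical to the object served
    at base_url + file name; otherwise size and MD5 will not match.
    """
    rows = []
    for path in files:
        data = Path(path).read_bytes()
        md5_b64 = base64.b64encode(hashlib.md5(data).digest()).decode("ascii")
        url = f"{base_url.rstrip('/')}/{quote(Path(path).name)}"
        rows.append((url, len(data), md5_b64))
    # Rows are sorted by URL in UTF-8 byte order.
    rows.sort(key=lambda row: row[0].encode("utf-8"))
    return rows


def build_url_list(files, base_url):
    """Build the full TSV text: header line, then one tab-separated row per file."""
    lines = [HEADER]
    for url, size, md5_b64 in url_list_rows(files, base_url):
        lines.append(f"{url}\t{size}\t{md5_b64}")
    return "\n".join(lines) + "\n"
```

Passing only some of the downloaded paths to `build_url_list` covers the "subset of the files" case; the result can be written out with `Path("urls.tsv").write_text(...)`.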
I have used these to fine-tune BLOOM and have also generated ada embeddings for all of these, linked to all RFCs. I am currently working on a query interface. Later I will add scripts that scrape mailing list archives and embed and index them properly. I will share the rest as soon as possible.