benfortuna / newsagent

An RSS aggregation library

Implement path/uid translation from url #1

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Generate a unique path for each feed/entry link, e.g.:

import java.security.MessageDigest

import org.bouncycastle.util.encoders.Hex

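// Builds a uid from the reversed host labels, the non-empty path segments,
// and a hex-encoded MD5 checksum of the full URL.
def url2uid = { url ->
    def uid = url.host.split('\\s*\\.\\s*').reverse() as List
    uid.addAll(url.path.split('\\s*/\\s*').findAll { !it.empty })
    def digest = MessageDigest.getInstance('md5')
    def checksum = digest.digest(url.toString().bytes)
    // leftShift appends the checksum and returns the list
    uid << new String(Hex.encode(checksum))
}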

URL url = ['http://games.slashdot.org/story/11/11/25/2217247/valves-gabe-newell-on-piracy-its-not-a-pricing-problem?utm_source=rss1.0mainlinkanon&utm_medium=feed']

println url2uid(url)
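
For comparison, here is a rough plain-Java sketch of the same idea that avoids the Bouncy Castle dependency. The class name `UrlToUid` and the helper `hex` are hypothetical, and it assumes UTF-8 when hashing the URL string (the Groovy closure above uses the platform default charset), so treat it as an illustration rather than the library's implementation.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class UrlToUid {

    // Hypothetical helper: lower-case hex encoding using only the JDK.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Reversed host labels + non-empty path segments + MD5 of the full URL.
    static List<String> url2uid(URI url) {
        List<String> uid = new ArrayList<>(Arrays.asList(url.getHost().split("\\s*\\.\\s*")));
        Collections.reverse(uid);
        for (String segment : url.getPath().split("\\s*/\\s*")) {
            if (!segment.isEmpty()) {
                uid.add(segment);
            }
        }
        try {
            MessageDigest digest = MessageDigest.getInstance("MD5");
            // Assumption: UTF-8 encoding of the URL string.
            uid.add(hex(digest.digest(url.toString().getBytes(StandardCharsets.UTF_8))));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
        return uid;
    }

    public static void main(String[] args) {
        URI url = URI.create("http://games.slashdot.org/story/11/11/25/2217247/"
                + "valves-gabe-newell-on-piracy-its-not-a-pricing-problem"
                + "?utm_source=rss1.0mainlinkanon&utm_medium=feed");
        // Join the parts with '/' to get a repository-style path.
        System.out.println(String.join("/", url2uid(url)));
    }
}
```

The query string does not appear as a path segment, but it still influences the trailing checksum, so two links that differ only in tracking parameters would get different uids.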

Original issue reported on code.google.com by benfortuna on 29 Nov 2011 at 2:50

GoogleCodeExporter commented 9 years ago
For byte[]:

import java.security.MessageDigest;

import org.bouncycastle.util.encoders.Hex;

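// Builds a path from the MD5 checksum of the content: the checksum split
// into two-character segments, followed by the full checksum.
def bytes2path = { bytes ->
    def digest = MessageDigest.getInstance('md5')
    def checksum = new String(Hex.encode(digest.digest(bytes)))

    // \G anchors each match at the end of the previous one, so this splits after every 2 chars
    def path = checksum.split(/(?<=\G.{2})/) as List
    path << checksum
}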

byte[] bytes = new File('src/main/groovy/bytes.groovy').bytes

println bytes2path(bytes)
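
A possible JDK-only Java equivalent of the closure above is sketched below. The class name `BytesToPath` is hypothetical, and a simple `substring` loop stands in for the `\G` lookbehind split; it is meant only to illustrate the resulting path layout.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

public class BytesToPath {

    // Two-character segments of the MD5 hex checksum, followed by the full checksum.
    static List<String> bytes2path(byte[] bytes) {
        byte[] raw;
        try {
            raw = MessageDigest.getInstance("MD5").digest(bytes);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
        StringBuilder sb = new StringBuilder(raw.length * 2);
        for (byte b : raw) {
            sb.append(String.format("%02x", b & 0xff));
        }
        String checksum = sb.toString();

        List<String> path = new ArrayList<>();
        // Stand-in for the regex split: take the checksum two characters at a time.
        for (int i = 0; i < checksum.length(); i += 2) {
            path.add(checksum.substring(i, i + 2));
        }
        path.add(checksum);
        return path;
    }

    public static void main(String[] args) {
        // Sample input chosen for illustration; the thread reads a file instead.
        byte[] bytes = "abc".getBytes(StandardCharsets.UTF_8);
        System.out.println(String.join("/", bytes2path(bytes)));
    }
}
```

An MD5 checksum is 32 hex characters, so the result always has 17 elements: 16 two-character segments plus the full checksum.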

Original comment by benfortuna on 29 Nov 2011 at 11:20