guerrerocarlos / CacheP2P

"More users = More capacity"
http://www.cachep2p.com/
MIT License
865 stars, 52 forks

How to automatically create cachep2p.security.js? #26

Open: extensionsapp opened this issue 7 years ago

extensionsapp commented 7 years ago

Hello.

How to automatically create cachep2p.security.js?

I have a sitemap and want to generate cachep2p.security.js from its data.

How can I do that?

guerrerocarlos commented 7 years ago

Hello @extensionsapp

Currently there is no way to create cachep2p.security.js automatically, but it could be achieved easily: read your sitemap, fetch the content of each URL with a library like request, use simple-sha1 to compute the security hash of each page, and save the hashes into the cachep2p.security.js file.
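The steps described above could be sketched roughly as follows. This is a minimal sketch, not the project's own tooling: it uses Node's built-in crypto module in place of simple-sha1, the names extractUrls, sha1Hex, buildHashMap and fetchBody are hypothetical, and it assumes the sitemap's <loc> values contain no XML entities. The exact layout that cachep2p.security.js expects is not shown here and should be taken from the project README.

```javascript
const crypto = require('crypto');

// Pull every <loc>...</loc> URL out of a sitemap XML string.
// Assumption: URLs contain no XML entities such as &amp;.
function extractUrls(sitemapXml) {
  const urls = [];
  const locPattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  let match;
  while ((match = locPattern.exec(sitemapXml)) !== null) {
    urls.push(match[1]);
  }
  return urls;
}

// SHA-1 hex digest of a page body (the content, not the URL).
function sha1Hex(content) {
  return crypto.createHash('sha1').update(content, 'utf8').digest('hex');
}

// Build a { url: hash } map. fetchBody is a hypothetical callback,
// (url) => Promise<string>, supplied by the caller, for example a
// small wrapper around the request library.
async function buildHashMap(sitemapXml, fetchBody) {
  const hashes = {};
  for (const url of extractUrls(sitemapXml)) {
    hashes[url] = sha1Hex(await fetchBody(url));
  }
  return hashes;
}

// Usage with in-memory pages instead of real HTTP requests.
const sitemap =
  '<urlset><url><loc>https://hello.com/</loc></url>' +
  '<url><loc>https://hello.com/world</loc></url></urlset>';
const pages = {
  'https://hello.com/': '<p>home</p>',
  'https://hello.com/world': '<p>world</p>'
};
buildHashMap(sitemap, async (url) => pages[url]).then((hashes) => {
  console.log(JSON.stringify(hashes, null, 2));
});
```

The resulting map still has to be written out in whatever form cachep2p.security.js requires. As the rest of this thread shows, the hashed text also has to match what the browser side hashes, which is not necessarily the raw HTTP response body.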

extensionsapp commented 7 years ago

Hash from the URL or from the site content?

sha1('https://hello.com/world', function (hash) {
  console.log(hash)
  // > 6b1c01703b68cf9b35ab049385900b5c428651b6
})

OR

sha1('<!DOCTYPE html><html lang="en"><head> ...', function (hash) {
  console.log(hash)
  // > 6b1c01703b68cf9b35ab049385900b5c428651b6
})

guerrerocarlos commented 7 years ago

The content of each URL (the second option).

extensionsapp commented 7 years ago

Thank you, Carlos. I'm going to try now.

extensionsapp commented 7 years ago

I use ExpressJS.

...
html = '' +
    'Hello World' +
    '<script type="text/javascript" src="/js/cachep2p.min.js">' +
    '</script><script type="text/javascript" src="/js/init.js"></script>';

console.log('crypto: ' + crypto.createHash('sha1').update(html).digest('hex'));

res.send(html);

console.log('crypto: ' + crypto.createHash('sha1').update(html).digest('hex'));
...

Server console output:

crypto: 9b7f042794922f59decf62221dde4dca0b18524e
crypto: 9b7f042794922f59decf62221dde4dca0b18524e

Browser console:

screenshot_2

Why is the hash different?

guerrerocarlos commented 7 years ago

Compare document.documentElement.innerHTML in the browser with what you send from ExpressJS.

extensionsapp commented 7 years ago

The browser automatically adds <head></head><body>...</body>.

Ohh, I have many elements on the website, and every restart produces a new hash. The code sent by the server (and shown in the browser's page source) != document.documentElement.innerHTML.

Browser/server:

<!DOCTYPE html>
<html lang=en>
<head>
<meta http-equiv=content-type content="text/html; charset=utf-8">
<meta name=viewport content="width=device-width,initial-scale=1">

innerHTML:

<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="viewport" content="width=device-width,initial-scale=1">

I use https://github.com/kangax/html-minifier

Can I switch to creating a hash based on the URL?
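The mismatch described above is expected: document.documentElement.innerHTML is a re-serialization of the parsed DOM, so it leaves out the doctype and the <html> tag itself and writes attribute values with double quotes, whatever the server actually sent. A minimal sketch of the effect on the hash follows; both strings are illustrative assumptions modelled on the samples in this comment, not captured browser output.

```javascript
const crypto = require('crypto');

// SHA-1 hex digest of a string.
function sha1Hex(text) {
  return crypto.createHash('sha1').update(text, 'utf8').digest('hex');
}

// Assumed server response: minified, with unquoted attribute values.
const sent =
  '<!DOCTYPE html><html lang=en><head><meta charset=UTF-8></head>' +
  '<body><p id=hello>world</p></body></html>';

// Assumed DOM serialization of the same page: no doctype, no <html>
// wrapper, attribute values re-quoted.
const serialized =
  '<head><meta charset="UTF-8"></head>' +
  '<body><p id="hello">world</p></body>';

console.log('sent:      ', sha1Hex(sent));
console.log('serialized:', sha1Hex(serialized));
console.log('equal:     ', sha1Hex(sent) === sha1Hex(serialized)); // false
```

Any hash generated on the server therefore has to be computed over text that matches the browser's serialization, not over the raw response.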

extensionsapp commented 7 years ago

Every restart produces a new hash.

The social network widget automatically changes some value to a random one.

screenshot_2

But even if I remove the widget, the problem remains that the code is different (browser source != innerHTML). The hash needs to be generated from something else, for example the URL.

extensionsapp commented 7 years ago

Can I pass the page hash computed on the server in a tag? <meta name="cachep2p" content="9b7f042794922f59decf62221dde4dca0b18524e">

And then use document.head.querySelector('[name="cachep2p"]').content; instead of document.documentElement.innerHTML.

extensionsapp commented 7 years ago

Oh, I think I just realized that I was talking nonsense. The hash has to be computed from the data itself. In the end, I think the technology is not quite ready yet.
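One way to see why a hash carried inside the page cannot secure that same page: whoever supplies the content also supplies the tag. The sketch below only illustrates that reasoning and does not describe how CacheP2P is implemented; stamp, claimedHash, withoutMeta, selfCheck and trustedCheck are hypothetical helpers.

```javascript
const crypto = require('crypto');

// SHA-1 hex digest of a string.
function sha1Hex(text) {
  return crypto.createHash('sha1').update(text, 'utf8').digest('hex');
}

const META = /<meta name="cachep2p" content="([0-9a-f]{40})">/;

// Prefix a body with a meta tag holding the hash of that body.
function stamp(body) {
  return '<meta name="cachep2p" content="' + sha1Hex(body) + '">' + body;
}

// Read the hash a page claims for itself, or null if there is none.
function claimedHash(html) {
  const match = META.exec(html);
  return match ? match[1] : null;
}

// Remove the tag so the remaining content can be hashed.
function withoutMeta(html) {
  return html.replace(META, '');
}

// Check a page against the hash it carries itself.
function selfCheck(html) {
  return claimedHash(html) === sha1Hex(withoutMeta(html));
}

// Check a page against a hash obtained independently of the page.
function trustedCheck(html, trustedHash) {
  return sha1Hex(withoutMeta(html)) === trustedHash;
}

const trusted = sha1Hex('<p>Hello World</p>');
const genuine = stamp('<p>Hello World</p>');
const tampered = stamp('<p>Hello from a malicious peer</p>');

console.log(selfCheck(genuine), selfCheck(tampered)); // true true
console.log(trustedCheck(genuine, trusted), trustedCheck(tampered, trusted)); // true false
```

A peer that alters the page can simply recompute the embedded hash, so only a hash obtained separately from the content can detect the change.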

guerrerocarlos commented 7 years ago

Ignoring the social network widget, is the only difference the "<!DOCTYPE html>"?

extensionsapp commented 7 years ago

No, a lot is removed or changed: comments, whitespace, attribute quotes, minified CSS, minified JS, etc.

https://github.com/kangax/html-minifier

var minify = require('html-minifier').minify;

// Options in use; several of them change the markup that gets hashed.
minify(html, {
    removeComments: true,
    removeCommentsFromCDATA: true,
    collapseWhitespace: true,
    collapseBooleanAttributes: true,
    removeRedundantAttributes: true,
    useShortDoctype: true,
    removeAttributeQuotes: true,
    removeEmptyAttributes: true,
    minifyCSS: true,
    minifyJS: true
});

guerrerocarlos commented 7 years ago

Can't you hash the minified output instead?

extensionsapp commented 7 years ago

screenshot_3

Server code (before minification):

<!DOCTYPE html><html lang="en"><head><meta charset="UTF-8"><title>Hello World</title></head><body><div class=""><p id="hello">world</p><span style="color:#000">Hello World</span></div><script src="//unpkg.com/cachep2p/cachep2p.min.js"></script><script src="/files/cachep2p.security.js"></script><script>var cachep2p = new CacheP2P;</script><script>console.log(document.documentElement.innerHTML);</script></body></html>

4bf59c219f7eb1278b15b82d25490e1130a1a7ff

Minified server code === browser source (Ctrl+U, view-source:https://...):

<!DOCTYPE html><html lang=en><head><meta charset=UTF-8><title>Hello World</title></head><body><div><p id=hello>world</p><span style=color:#000>Hello World</span></div><script src=//unpkg.com/cachep2p/cachep2p.min.js></script><script src=/files/cachep2p.security.js></script><script>var cachep2p=new CacheP2P</script><script>console.log(document.documentElement.innerHTML)</script></body></html>

c8e2912b026c5ccf34ba00339b3fe45ff0189a85

screenshot_4

Browser source !== document.documentElement.innerHTML:

<head><meta charset="UTF-8"><title>Hello World</title></head><body><div><p id="hello">world</p><span style="color:#000">Hello World</span></div><script src="//unpkg.com/cachep2p/cachep2p.min.js"></script><script src="/files/cachep2p.security.js"></script><script>var cachep2p=new CacheP2P</script><script>console.log(document.documentElement.innerHTML)</script></body>

2d78c678d2b4e2a7b5833ad8209151a39adcc5ae

extensionsapp commented 7 years ago

And if I create the hash from what RequestJS fetches, then RequestJS body === browser source === minified server code !== document.documentElement.innerHTML:

var request = require('request');
request('http://hello.com/', function (error, response, body) {
  console.log('error:', error); // Print the error if one occurred
  console.log('statusCode:', response && response.statusCode); // Print the status code if a response was received
  console.log('body:', body); // Print the HTML body of the fetched page
});

Result:

error: null
statusCode: 200
body: <!DOCTYPE html><html lang=en><head><meta charset=UTF-8><title>Hello World</title></head><body><div><p id=hello>world</p><span style=color:#000>Hello World</span></div><script src=//unpkg.com/cachep2p/cachep2p.min.js></script><script src=/files/cachep2p.security.js></script><script>var cachep2p=new CacheP2P</script><script>console.log(document.documentElement.innerHTML)</script></body></html>

c8e2912b026c5ccf34ba00339b3fe45ff0189a85

guerrerocarlos commented 7 years ago

I understand. I guess one solution would be to use cheerio on the server side to produce the same content as document.documentElement.innerHTML.

extensionsapp commented 7 years ago

Good idea, @guerrerocarlos

var cheerio = require('cheerio');
var html = '...';
var $ = cheerio.load(html);
// Inner HTML of the <html> element, as a stand-in for document.documentElement.innerHTML
html = $('html').html();

Do I have a bug in the browser? How can I check whether the cache is working? On every page refresh, the browser still requests the page from the server.

And I just noticed that the hash is different in the Opera browser.

extensionsapp commented 7 years ago

Oh, even with the same HTML code in the browser and on the server, the hash is different.

screenshot_9

console.log(document.documentElement.innerHTML) === $('html').html(), but the hashes differ:

[CacheP2P] this page's security hash: 89c4ff2112efe2660b09e4881989237dcba2b88f !== 6fc5e2c1af77ee5e511b729493c29fab50bb0bb9

Output of console.log(document.documentElement.innerHTML): https://pastebin.com/raw/ucxXtciA
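When two strings look identical but hash differently, they usually differ in a character that is hard to see in a console, such as a trailing newline, a CRLF line ending or a non-breaking space. The helper below is a hypothetical debugging aid (firstDifference is not part of CacheP2P or cheerio) that reports the first position where two strings diverge. It does not establish what the cause was in this thread.

```javascript
// Return the first index at which a and b differ, together with the
// character codes found there (null when one string has already ended).
// Returns null when the strings are identical.
function firstDifference(a, b) {
  const longest = Math.max(a.length, b.length);
  for (let i = 0; i < longest; i++) {
    if (a[i] !== b[i]) {
      return {
        index: i,
        a: i < a.length ? a.charCodeAt(i) : null,
        b: i < b.length ? b.charCodeAt(i) : null
      };
    }
  }
  return null;
}

// A trailing newline is invisible in most consoles but changes the hash.
console.log(firstDifference('<p>a</p>\n', '<p>a</p>')); // { index: 8, a: 10, b: null }
console.log(firstDifference('<p>a</p>', '<p>a</p>'));   // null
```

Running it on the server-side string and on the text copied from the browser narrows the search to a single character instead of two opaque digests.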

screenshot_10

extensionsapp commented 7 years ago

Are there any thoughts about this error?