airdcpp-web / airdcpp-webclient

Communal peer-to-peer file sharing application for file servers/NAS devices
https://airdcpp-web.github.io

What needs to be the same to transfer hashing data from one server to another? #437

Closed beardedfool closed 1 year ago

beardedfool commented 1 year ago

Trying to sense-check something before I lose myself down a rabbit hole, please.

Server 1 runs airdc; the large share is on remote server 2, mounted over sshfs.

I believe the initial hashing would be costly/laborious over a slow connection.

Questions: 1) Is it possible to do the hashing on server 2 for the files there and transfer it over to server 1, assuming the same filesystems? Or is there

2) I'm presuming here it's a one-time transfer of hashing data and a temporary setup of airdc on server 2. Q. Am I right in assuming that transferring partial hashing data is non-trivial?

3) What needs to be the same to do the transfer between servers? e.g. match paths (get around this with links)

4) Is my initial assumption even correct: would hashing over sshfs be costly?

I also have migration to new servers/disks in mind whilst asking this.

Huge thanks for looking at this

maksis commented 1 year ago

Is it possible to do the hashing on server 2 for the files there and transfer it over to server 1 assuming same filesystems. Or is there

What needs to be the same to do the transfer between servers? e.g. match paths (get around this with links)

As long as the last modification date and path stay the same for each file, you should be fine. I think that symlinks will work but you might want to test it first with smaller directories before hashing your whole share.
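Before replacing the hash database, it may be worth checking that both servers really do see the same paths and modification times. The following is a minimal sketch, not part of AirDC++: `build_manifest` and `diff_manifests` are hypothetical helper names, paths are compared relative to a share root (so the root itself still has to resolve to the same path on both machines, for example via a symlink), and timestamps are truncated to whole seconds on the assumption that sub-second precision may differ between filesystems.

```python
import os
from pathlib import Path


def build_manifest(root):
    """Map each file's path (relative to root) to its mtime in whole seconds."""
    root = Path(root)
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = Path(dirpath) / name
            rel = full.relative_to(root).as_posix()
            # Truncate to seconds (assumption: sub-second precision may differ).
            manifest[rel] = int(full.stat().st_mtime)
    return manifest


def diff_manifests(source, target):
    """Return (missing, changed) relative paths of target compared with source."""
    missing = sorted(p for p in source if p not in target)
    changed = sorted(p for p in source if p in target and source[p] != target[p])
    return missing, changed
```

Run `build_manifest` on each server (for instance, dump the result to JSON on server 2 and load it on server 1), then pass both to `diff_manifests`. Any path reported as missing or changed would presumably need to be rehashed.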

Am I right in assuming that transferring partial hashing data is non-trivial

You should replace the whole hash database; it's not possible to transfer partial data (or it definitely won't be easy).

Is my initial assumption even correct, hashing over sshfs would be costly?

I'm not familiar with sshfs but I assume that the speed of your connection is more relevant here
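Hashing has to read every byte of every file, so over a network mount the whole share crosses the link at least once. A rough lower bound on the time can be estimated as below. This is only a back-of-the-envelope sketch: `estimate_transfer_hours` is a hypothetical helper, the figures in the usage note are made up, and sshfs encryption or protocol overhead is ignored.

```python
def estimate_transfer_hours(share_bytes, link_mbit_per_s):
    """Lower bound on the hours needed to read share_bytes over the link."""
    # Convert megabits per second to bytes per second.
    bytes_per_second = link_mbit_per_s * 1_000_000 / 8
    return share_bytes / bytes_per_second / 3600
```

For example, `estimate_transfer_hours(4e12, 100)` (a hypothetical 4 TB share over a 100 Mbit/s link) gives roughly 89 hours, which is the kind of number that would make hashing locally on server 2 attractive.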

beardedfool commented 1 year ago

Thanks for coming back. I suspect you've answered this multiple times on the forums, so thanks for the patience. They are permanently down, right? It's not just me.

Before I leave you in peace, two last questions please.

5) If the path does change, is there any way to edit that in the db somehow, or is it used in the hashing process? i.e. does changing the path always necessitate a new hash? I presume so from your answer above.

6) Does airdc++ do anything differently in the hashing to DC++? i.e. if I read documentation on that, should it be largely the same, and are any differences noted anywhere? (Though I guess the code is the real source of information.)

Happy holidays!

maksis commented 1 year ago

They are permanently down, right?

That's quite possible...

If the path does change, is there any way to edit that in the db somehow, or is it used in the hashing process? i.e. does changing the path always necessitate a new hash? I presume so from your answer above.

The only supported way is to use the API (you'll probably need to write a script for this): https://airdcpp.docs.apiary.io/#reference/hashing/methods/rename-path
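A script for this could look roughly like the sketch below, which only builds the HTTP request so it can be inspected before anything is sent. Everything specific here is an assumption to be checked against the linked API documentation: the endpoint path `/api/v1/hash/rename_path`, the JSON field names `old_path` and `new_path`, and the way credentials are passed in the `Authorization` header. `build_rename_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_rename_request(base_url, authorization, old_path, new_path):
    """Build (but do not send) a rename-path request for the hashing API."""
    # Assumed field names; verify them against the API reference.
    body = json.dumps({"old_path": old_path, "new_path": new_path}).encode("utf-8")
    return urllib.request.Request(
        # Assumed endpoint path; verify it against the API reference.
        url=f"{base_url.rstrip('/')}/api/v1/hash/rename_path",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": authorization,
        },
    )
```

Once the request looks right, it can be sent with `urllib.request.urlopen(request)`. As with the symlink approach, trying it on a small directory first seems sensible.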

Does airdc++ do anything differently in the hashing to DC++? i.e. if I read documentation on that, should it be largely the same, and are any differences noted anywhere? (Though I guess the code is the real source of information.)

Yes, the implementation (including the database format) is different in AirDC++.

beardedfool commented 1 year ago

Thanks. Hugely appreciated. I'll report back if I find anything useful, but I'll close this for now.