MetPX / wmo_mesh

minimal sample to demonstrate mesh network with a pub/sub message passing protocol (mqtt in this case.)
GNU General Public License v2.0

a header that indicates the size of a product being announced. #7

Open petersilva opened 5 years ago

petersilva commented 5 years ago

The header giving file size was one of those stripped from Sarracenia to make the example, as it was unnecessary to make the protocol work. ETCTS wanted a header to indicate the size of products being advertised. Sarracenia has a header that gives this information, called parts (for partition strategy), which is fully documented. Changing the logic of that header is relatively difficult, so it should not be changed lightly. Putting a placeholder here to track the discussion.

petersilva commented 5 years ago

People should read about the existing header here:

https://github.com/MetPX/sarracenia/blob/master/doc/sr_postv3.7.rst#the-fixed-fields

The parts field supports replication of partitioned files where the original file is either available at the source as one file, or only a part of it is available (on an intermediate server). This is used to transfer very large files while preventing the capybara in the anaconda effect, which creates large lumps on the intervening data pumps. The block size is programmable using the same field.

petersilva commented 5 years ago

v02 format: parts="<method>,<bsz>,<blktot>,<brem>,<bno>"

method=1|p|i


* 1 - a complete file announced as one piece. In this case, the bsz gives the size of the entire file.
  blktot=1,brem=0,bno=1  

* i - inplace strategy: the announcement describes a portion of a file. On the host advertising the
  file, the entire file is available for download, but the announcement only deals with a subset
  of the file. For example, if the file is 5000 bytes long and the block size is 1024, then five
  portions of the file would be advertised:

       i,1024,5,904,0 
       i,1024,5,904,1
       i,1024,5,904,2
       i,1024,5,904,3
       i,1024,5,904,4

* p - partitioned strategy: instead of one large file to retrieve, there will
  be five partitions on the server, named <filename>,Part,1024,5,904,0  ... 

In either case, the size of the file is bsz * (blktot-1) + brem = 1024 * 4 + 904 = 5000.
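To make the arithmetic concrete, here is a minimal Python sketch that builds the v02 parts values for a file and recovers the file size from any one of them, using bsz * (blktot-1) + brem. The function names `v02_parts` and `v02_file_size` are hypothetical, not Sarracenia APIs. The handling of a file that divides evenly into blocks (brem of 0) is an assumption, since the thread does not cover that case.

```python
def v02_parts(file_size, bsz, method="i"):
    """Return one v02 parts value per block (hypothetical helper, not a Sarracenia API)."""
    blktot, brem = divmod(file_size, bsz)
    if brem:
        blktot += 1  # a final short block holds the remainder
    return [f"{method},{bsz},{blktot},{brem},{bno}" for bno in range(blktot)]


def v02_file_size(parts):
    """Recover the total file size from a single v02 parts value."""
    method, bsz, blktot, brem, _bno = parts.split(",")
    bsz, blktot, brem = int(bsz), int(blktot), int(brem)
    if method == "1":
        return bsz  # for a complete file, bsz carries the whole size
    if brem == 0:
        # Assumption: no remainder means every block is full.
        return bsz * blktot
    return bsz * (blktot - 1) + brem


for value in v02_parts(5000, 1024):
    print(value)
print(v02_file_size("i,1024,5,904,3"))
```

Any of the five announcements yields the same total, which is what lets a subscriber size the destination file before all of the blocks have arrived.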

V03 Proposal

In use, the vast majority of files are sent as single part files (strategy: 1), so it would be reasonable to abstract out the parts header as:

size=5000 and nothing else. Only in the case of a partitioned file would we need to add another header, perhaps something like:

"size": 904,

"blocks" :  
  { 
       "method": "inplace" | "partitioned", 
       "size": 1024, 
       "count": 5,  
       "remainder": 904, 
       "number" : 4 
 }

So for partitioned files, there would be an extra header. The question is: can we keep the file name suffixes the same as v02? Having JSON in filenames would be ungainly, so it is probably better to use the Part,1024,5,904,0 file suffix as-is. What do people think?
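As a rough illustration of how a v02 parts value might map onto the proposed v03 headers, here is a Python sketch. The function name `v02_parts_to_v03` is hypothetical. It also assumes that, for a partitioned announcement, the top-level "size" is the byte count of the block being announced. That reading matches the example above (block 4 is the final, 904-byte block) but is not confirmed in the thread.

```python
import json

# v02 method letters mapped to the names used in the v03 proposal.
METHODS = {"i": "inplace", "p": "partitioned"}


def v02_parts_to_v03(parts):
    """Map a v02 parts value to proposed v03 headers (hypothetical helper)."""
    method, bsz, blktot, brem, bno = parts.split(",")
    bsz, blktot, brem, bno = int(bsz), int(blktot), int(brem), int(bno)
    if method == "1":
        # Whole file in one piece: only "size" is needed.
        return {"size": bsz}
    # Assumption: "size" is the length of the announced block, so the
    # last block carries the remainder when there is one.
    is_last = bno == blktot - 1
    block_bytes = brem if (is_last and brem > 0) else bsz
    return {
        "size": block_bytes,
        "blocks": {
            "method": METHODS[method],
            "size": bsz,
            "count": blktot,
            "remainder": brem,
            "number": bno,
        },
    }


print(json.dumps(v02_parts_to_v03("1,5000,1,0,1")))
print(json.dumps(v02_parts_to_v03("i,1024,5,904,4"), indent=2))
```

The single-part case collapses to one header, which is the simplification the proposal is after. The "blocks" object only appears when a file is actually split.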

benlapETS commented 5 years ago

I see 2 things that don't make it:

petersilva commented 5 years ago

For '1' there is no 'blocks' header, only 'size', so there is no brem. In v02 there was a choice: it could have been brem, but we picked the block size ('1' means that there is one block in the file, and the size of that block could be given by either field). Yes, using brem would have been more consistent with the other strategies, but then we would have had a 0-sized block, which seemed odd. It was a choice for v02. For v03, you don't need to make any such choice, because for '1' there is only 'size', which is a distinct improvement.

petersilva commented 5 years ago

Second comment: yeah, double typo. Fixed now: added 3 and removed 5.

petersilva commented 5 years ago

This proposal is now implemented in Sarracenia v2.19.03b1.