rh-lab-q / deltametadata-prototype

Prepare presentation charts #17

Closed mluscon closed 8 years ago

mluscon commented 8 years ago

onionka commented 8 years ago

The simulation program is implemented, with a few TODOs left (these features are not required).

But the data are repetitive:

LOG: [('3d4480621908348d8fbd18db4dde6c29b82ae752ff768ee15679eec3e578d692-filelists.xml.gz', './/cache//repodata/7e1d047406ddd03d3998035ee8e1fd902f0c07107c79e3e5b41d53c10cbfb3a8-filelists.xml.gz'), ('56965fe0a6b1c0aa4bf57f474efb61838a2488f29f660c7b3c002413496b4a71-primary.xml.gz', './/cache//repodata/b642bdf20b5848f56d29f5ce0fc5830d4ee79b897dfabe8317995f77637b969e-primary.xml.gz')]
LOG: [('3d4480621908348d8fbd18db4dde6c29b82ae752ff768ee15679eec3e578d692-filelists.xml.gz', './/cache//repodata/c6a119ac04415a7b4021fba4babc1f4738edc2f52a03e355957256b0f6a5ffa8-filelists.xml.gz')]
LOG: [('3d4480621908348d8fbd18db4dde6c29b82ae752ff768ee15679eec3e578d692-filelists.xml.gz', './/cache//repodata/c6a119ac04415a7b4021fba4babc1f4738edc2f52a03e355957256b0f6a5ffa8-filelists.xml.gz')]
LOG: []
LOG: []
LOG: [('56965fe0a6b1c0aa4bf57f474efb61838a2488f29f660c7b3c002413496b4a71-primary.xml.gz', './/cache//repodata/2443f4292062683297b8cce9fd8ca01f3d4a58156f459f4718d2ab4ac591add6-primary.xml.gz')]
LOG: [('56965fe0a6b1c0aa4bf57f474efb61838a2488f29f660c7b3c002413496b4a71-primary.xml.gz', './/cache//repodata/2443f4292062683297b8cce9fd8ca01f3d4a58156f459f4718d2ab4ac591add6-primary.xml.gz')]
LOG: [('56965fe0a6b1c0aa4bf57f474efb61838a2488f29f660c7b3c002413496b4a71-primary.xml.gz', './/cache//repodata/2443f4292062683297b8cce9fd8ca01f3d4a58156f459f4718d2ab4ac591add6-primary.xml.gz')]
LOG: []
LOG: [('3d4480621908348d8fbd18db4dde6c29b82ae752ff768ee15679eec3e578d692-filelists.xml.gz', './/cache//repodata/98f5da349c2347e914c11aa4fb56bcc639105407388eb4e048a2e11bc4a27d77-filelists.xml.gz'), ('56965fe0a6b1c0aa4bf57f474efb61838a2488f29f660c7b3c002413496b4a71-primary.xml.gz', './/cache//repodata/ddf2fd6a04c0c6b25a075c26bc3eb7706db889eb5558db735809fbb30fbd9609-primary.xml.gz')]
LOG: [('3d4480621908348d8fbd18db4dde6c29b82ae752ff768ee15679eec3e578d692-filelists.xml.gz', './/cache//repodata/6237b09004ebd7f25972c5499a56a34c65ae9d76f2288c50ad26d9e32d0dc4e5-filelists.xml.gz'), ('56965fe0a6b1c0aa4bf57f474efb61838a2488f29f660c7b3c002413496b4a71-primary.xml.gz', './/cache//repodata/28661a5363df5e4ffeddcc4733264655c438a705809912230fdfabae86f4130c-primary.xml.gz')]
LOG: [('3d4480621908348d8fbd18db4dde6c29b82ae752ff768ee15679eec3e578d692-filelists.xml.gz', './/cache//repodata/6237b09004ebd7f25972c5499a56a34c65ae9d76f2288c50ad26d9e32d0dc4e5-filelists.xml.gz'), ('56965fe0a6b1c0aa4bf57f474efb61838a2488f29f660c7b3c002413496b4a71-primary.xml.gz', './/cache//repodata/28661a5363df5e4ffeddcc4733264655c438a705809912230fdfabae86f4130c-primary.xml.gz')]
LOG: [('3d4480621908348d8fbd18db4dde6c29b82ae752ff768ee15679eec3e578d692-filelists.xml.gz', './/cache//repodata/34d2dbd05646c26f93c3c903f4fe3f3a0e048ccca1c660b56b3c1ebfe52590ae-filelists.xml.gz'), ('56965fe0a6b1c0aa4bf57f474efb61838a2488f29f660c7b3c002413496b4a71-primary.xml.gz', './/cache//repodata/443d37b09ac0e6676ddd44715eeaabb462fc4c5662089dd57f45f27edc01d487-primary.xml.gz')]
LOG: [('3d4480621908348d8fbd18db4dde6c29b82ae752ff768ee15679eec3e578d692-filelists.xml.gz', './/cache//repodata/34d2dbd05646c26f93c3c903f4fe3f3a0e048ccca1c660b56b3c1ebfe52590ae-filelists.xml.gz'), ('56965fe0a6b1c0aa4bf57f474efb61838a2488f29f660c7b3c002413496b4a71-primary.xml.gz', './/cache//repodata/443d37b09ac0e6676ddd44715eeaabb462fc4c5662089dd57f45f27edc01d487-primary.xml.gz')]
LOG: [('3d4480621908348d8fbd18db4dde6c29b82ae752ff768ee15679eec3e578d692-filelists.xml.gz', './/cache//repodata/34d2dbd05646c26f93c3c903f4fe3f3a0e048ccca1c660b56b3c1ebfe52590ae-filelists.xml.gz'), ('56965fe0a6b1c0aa4bf57f474efb61838a2488f29f660c7b3c002413496b4a71-primary.xml.gz', './/cache//repodata/443d37b09ac0e6676ddd44715eeaabb462fc4c5662089dd57f45f27edc01d487-primary.xml.gz')]
LOG: [('3d4480621908348d8fbd18db4dde6c29b82ae752ff768ee15679eec3e578d692-filelists.xml.gz', './/cache//repodata/4c7195e2dfdd3919f458a1e994ea6f31d97331d5f908f93e49392d74ffcd24bc-filelists.xml.gz'), ('56965fe0a6b1c0aa4bf57f474efb61838a2488f29f660c7b3c002413496b4a71-primary.xml.gz', './/cache//repodata/382291ae1d150f98e250c4b518b98e73f9c4ca5125d224931c36c54b3efba67d-primary.xml.gz')]
LOG: [('3d4480621908348d8fbd18db4dde6c29b82ae752ff768ee15679eec3e578d692-filelists.xml.gz', './/cache//repodata/4c7195e2dfdd3919f458a1e994ea6f31d97331d5f908f93e49392d74ffcd24bc-filelists.xml.gz'), ('56965fe0a6b1c0aa4bf57f474efb61838a2488f29f660c7b3c002413496b4a71-primary.xml.gz', './/cache//repodata/382291ae1d150f98e250c4b518b98e73f9c4ca5125d224931c36c54b3efba67d-primary.xml.gz')]
LOG: []

In these logs the program dumps an array of tuples of the form (downloaded_file_path, synchronised_file_path). zsync downloads at most 2 files per run. If one file is exactly the same or is missing, it downloads 1 file; if both are the same or missing, an empty array is dumped.

On the other hand, we have statistics in which zsync always downloads something, even when the files have the same hash (I expect the files to be the same when their hashes are the same).

Here is an example of the output:
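Since each LOG line is a Python list literal, the tuple structure described above can be read back with a few lines of Python. This is only a minimal sketch: `parse_log_line` and `changed_kinds` are hypothetical helper names, not functions from the simulation program, and the short `aaa`/`bbb` hashes in the usage lines are placeholders for the real ones.

```python
import ast

LOG_PREFIX = "LOG: "

def parse_log_line(line):
    """Return the list of (downloaded, synchronised) path tuples from one LOG line."""
    line = line.strip()
    if not line.startswith(LOG_PREFIX):
        raise ValueError("not a LOG line: %r" % line)
    # The payload is a plain Python literal, so literal_eval reads it
    # without executing arbitrary code.
    return ast.literal_eval(line[len(LOG_PREFIX):])

def changed_kinds(pairs):
    """Return the metadata kinds (the part after the hash) that were synchronised."""
    return sorted(downloaded.rsplit("-", 1)[-1] for downloaded, _ in pairs)

# Usage with placeholder hashes:
one_file = "LOG: [('aaa-filelists.xml.gz', './/cache//repodata/bbb-filelists.xml.gz')]"
print(changed_kinds(parse_log_line(one_file)))   # one file synchronised
print(changed_kinds(parse_log_line("LOG: []")))  # nothing synchronised
```

Counting `len(parse_log_line(line))` per line gives 0, 1 or 2, which matches the three cases described above.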

From        To          Downloaded  Synchronized    Downloaded/Synchronized Downloaded  Synchronized    Downloaded/Synchronized
20160601    20160602    15811784    39606           0.0025048406934979634   5511490     39025           0.007080662398008524
20160602    20160602    15817452    333522          0.0210856969883645
20160602    20160603    15849370    0               0.0
20160603    20160605                                                        5161918     108440          0.021007695201667287
20160605    20160605                                                        5542049     0               0.0
20160605    20160606                                                        5542049     0               0.0
20160606    20160609    14910921    520430          0.0349026059490222      5542049     189420          0.03417869455863707
20160609    20160610    15899912    128312          0.008069981770968291    5559036     50725           0.009124783505629393
20160610    20160610    15924417    0               0.0                     5565712     0               0.0
20160610    20160611    15924417    32233           0.0020241243368595537   5565712     41246           0.007410731996193839
20160611    20160611    15927296    0               0.0                     5568333     0               0.0
20160611    20160613    15927296    0               0.0                     5568333     0               0.0
20160613    20160613    15927296    339549          0.021318684602835283    5568333     153737          0.027609160587199078
20160613    20160614    15942325    0               0.0                     5575525     0               0.0
20160614    20160615    15942325    28979           0.0018177398842389677   5575525     31350           0.005622788885351604
20160615    20160615    15948894    35472           0.0022241040664010936   5581552     44269           0.007931306561329178
20160615    20160617    15953536    0               0.0                     5584160     0               0.0
20160617    20160618    15953536    357248          0.022393029357253465    5584160     97894           0.017530658147331023
20160618    20160618    15957092    72760           0.004559728050699965    5587975     39670           0.007099172777258309
20160618    20160619    16000133    0               0.0                     5594612     0               0.0
20160619    20160619                                                        5594612     68145           0.012180469351583274
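
For anyone consuming this table in a plotting script: the values in the ratio column equal Synchronized divided by Downloaded (for the first row, 39606 / 15811784 ≈ 0.0025), and some cells are empty. A minimal sketch of handling both follows; `parse_cell` and `sync_ratio` are hypothetical names, and returning `None` for a missing measurement is an assumption, not what the simulation program does.

```python
def parse_cell(text):
    """Convert one table cell to int, or None when the cell is empty."""
    text = text.strip()
    return int(text) if text else None

def sync_ratio(downloaded, synchronized):
    """Return synchronized / downloaded, or None when a value is missing."""
    if downloaded is None or synchronized is None:
        return None
    if downloaded == 0:
        # Avoid a ZeroDivisionError; treat it like a missing measurement.
        return None
    return synchronized / downloaded

# First row of the table above, first column group:
print(sync_ratio(parse_cell("15811784"), parse_cell("39606")))
# A row whose first column group is empty:
print(sync_ratio(parse_cell(""), parse_cell("")))
```

Keeping missing cells as `None` rather than 0 lets a plot distinguish "no data" from "nothing synchronized".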

mluscon commented 8 years ago

Are you considering repomd.xml in your stats? I don't see it in the logs.

I am not sure I understand your concerns regarding data repetition. It's completely legitimate for repomd.xml to differ between days only in its timestamp while primary and filelists stay the same.

LukasSlouka commented 8 years ago

Can you modify the output a bit? The plotting script won't be modifying values; it can only work with what it gets.

Thanks

mluscon commented 8 years ago

What is up with the empty values?

I presume it was some issue with the availability of OpenShift or of the particular mirror we used. Do we have logs from OpenShift, @PavolVican?