rfcx / arbimon-uploader

Desktop application for ingesting audio to RFCx platform
Apache License 2.0

Check checksum issue between uploader and ingest service #215

Closed rassokhina-e closed 5 months ago

rassokhina-e commented 5 months ago

Original from @rassokhin-s https://rfcx.slack.com/archives/G018NMPCBHS/p1711100798915549

Image

rassokhina-e commented 5 months ago

The report gives us the upload ID, but when I search for errors related to the checksum mismatch, I see nothing in the Grafana logs:

Image

Image

rassokhina-e commented 5 months ago

Original problem:

// Checks file metadata. Throws IngestionError if data is invalid
if (upload.checksum && upload.checksum !== meta.checksum) {
    throw new IngestionError('Checksum mismatch.', db.status.CHECKSUM)
}
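Since nothing searchable showed up in the logs for this upload, one option is to put the upload ID and both checksums into the error message itself. The following is only a sketch under assumptions: `IngestionError` and `db.status.CHECKSUM` are re-declared here as hypothetical stand-ins so the snippet runs on its own, and `validateChecksum` is an illustrative name, not a function from the ingest service.

```javascript
// Hypothetical stand-ins for the ingest service's own definitions.
class IngestionError extends Error {
  constructor (message, status) {
    super(message)
    this.name = 'IngestionError'
    this.status = status
  }
}
const db = { status: { CHECKSUM: 'CHECKSUM' } } // placeholder, not the real status code

// Same comparison as the original check, but the message carries
// the upload ID and both checksums so it can be found in the logs.
function validateChecksum (upload, meta) {
  if (upload.checksum && upload.checksum !== meta.checksum) {
    throw new IngestionError(
      `Checksum mismatch for upload ${upload._id}: ` +
      `expected ${upload.checksum}, got ${meta.checksum}`,
      db.status.CHECKSUM
    )
  }
}
```

With a message like this, searching the logs for the upload ID would surface the mismatch together with the two values being compared.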

upload id = 65fd51d23bdf6973f9f7d18a
file checksum from db = 4b848e8c77412de247bbc22aa16db97caf5f68d1
file checksum from audio meta = b8ac3e27554f8d2cacf4ea9f210cb77c8958caa5

stream id from audio = vu171ayzkm23
stream id from db = hlcbw0daxt1e

audio meta = {"format":"flac","duration":60,"sampleCount":2880000,"channelLayout":"mono","channelCount":1,"bitRate":399277,"sampleRate":48000,"codec":"flac","tags":{"comment":"Recorded at 15:55:00 26/12/2023 (UTC-5) during deployment CA67479D277F247E at medium gain while battery was 4.2V and temperature was 26.7C.","artist":"AudioMoth 248D9B0260375875","encoder":"Lavf58.24.101"},"size":2994579,"checksum":"b8ac3e27554f8d2cacf4ea9f210cb77c8958caa5"}

Upload metadata from database {"_id":"65fd51d23bdf6973f9f7d18a","streamId":"hlcbw0daxt1e","userId":"d3e551f1-043a-45bf-aeab-cd41968e0c06","status":10,"timestamp":"2023-12-26T20:55:00.000Z","originalFilename":"20231226_155500.WAV","checksum":"4b848e8c77412de247bbc22aa16db97caf5f68d1","createdAt":"2024-03-22T09:39:30.949Z","updatedAt":"2024-03-22T09:39:56.465Z","__v":0}
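Both checksums above are 40 hexadecimal characters, which is consistent with a SHA-1 digest. Assuming that is what the uploader and the ingest service compute (this thread does not confirm the algorithm), a minimal sketch of producing such a checksum in Node.js looks like this; `sha1Hex` is an illustrative name.

```javascript
const crypto = require('node:crypto')

// Assumption: the checksum is the hex-encoded SHA-1 digest of the file bytes.
function sha1Hex (data) {
  return crypto.createHash('sha1').update(data).digest('hex')
}

const a = sha1Hex(Buffer.from('abc'))
const b = sha1Hex(Buffer.from('abd'))
console.log(a.length) // 40
console.log(a === b) // false: different bytes give a different checksum
```

For a real file, the buffer would come from something like `fs.readFileSync(path)`. Two files with the same name but different content will always produce different values here, which matches what is seen in this upload.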

Image

rassokhina-e commented 5 months ago

The audio metadata keeps the original deployment ID and the site related to that deployment, which the user has since deleted (deleted site id = vu171ayzkm23). The second stream wasn't created in the device DB, so it looks like the user created it manually.
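The deployment ID in question sits in the `comment` tag of the audio metadata shown earlier. As a rough sketch of how it could be pulled out for comparison against the device DB, assuming the comment always follows the "during deployment <16 hex characters>" wording seen in this one file (`extractDeploymentId` is a hypothetical helper, not part of the uploader):

```javascript
// Extracts the deployment ID from an AudioMoth-style comment tag.
// Returns null when the tag is missing or does not match the assumed wording.
function extractDeploymentId (tags) {
  if (!tags || typeof tags.comment !== 'string') return null
  const match = tags.comment.match(/during deployment ([0-9A-Fa-f]{16})\b/)
  return match ? match[1] : null
}

const tags = {
  comment: 'Recorded at 15:55:00 26/12/2023 (UTC-5) during deployment CA67479D277F247E at medium gain while battery was 4.2V and temperature was 26.7C.',
  artist: 'AudioMoth 248D9B0260375875'
}
console.log(extractDeploymentId(tags)) // CA67479D277F247E
```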

Image

Image

rassokhina-e commented 5 months ago

The device DB has no information about the deleted device:

Image

rassokhina-e commented 5 months ago

The user created the site vu171ayzkm23 via the guardian app on 15/12/2023 and manually created the second site hlcbw0daxt1e on 20/12/2023. These sites have different names.

This is the upload to site hlcbw0daxt1e with checksum 4b848e8c77412de247bbc22aa16db97caf5f68d1 and file name 20231226_155500.WAV:

Image

Then the user tries to upload a file with the same name, 20231226_155500.WAV, but a different checksum to the same stream hlcbw0daxt1e:

Image

naluinui commented 5 months ago

Hi @rassokhina-e, Thanks for looking into this.

From your comments: https://github.com/rfcx/arbimon-uploader/issues/215#issuecomment-2018041009 https://github.com/rfcx/arbimon-uploader/issues/215#issuecomment-2018044123

I see that you found a suspicious case in the Device API: when a site is removed from Core/Arbimon, the Device API doesn't know. That could confuse users if we still allow them to upload files to a site that has been removed.

From https://github.com/rfcx/arbimon-uploader/issues/215#issuecomment-2018070482

For this case, I think we should allow the user to upload the same file name to different sites.

This comment: https://github.com/rfcx/arbimon-uploader/issues/215#issuecomment-2018030116

does seem to relate to the checksum problem, but I still can't get my head around it. Can you summarize what was causing the checksum problem, please?

rassokhina-e commented 5 months ago

Hi @naluinui, @koonchaya!

For this case, I think we should allow the user to upload the same file name to different sites.

Sorry, I posted all the details as I went through the investigation, and it looks like they weren't completely understandable :)

The issue occurs when the user tries to upload a file with the same name but a different checksum to the same site.
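To make the three possible outcomes explicit, here is a small sketch that classifies an incoming upload against the uploads already stored for a stream. The field names (`streamId`, `originalFilename`, `checksum`) follow the upload metadata shown earlier; `classifyUpload` and its return labels are hypothetical and only illustrate the case described here, not how the ingest service is actually written.

```javascript
// Classifies an incoming upload against existing uploads.
//   'new'           no file with this name in the target stream
//   'duplicate'     same name and same checksum in the target stream
//   'name-conflict' same name but a different checksum in the target stream
function classifyUpload (existingUploads, incoming) {
  const sameName = existingUploads.filter(u =>
    u.streamId === incoming.streamId &&
    u.originalFilename === incoming.originalFilename
  )
  if (sameName.length === 0) return 'new'
  return sameName.some(u => u.checksum === incoming.checksum)
    ? 'duplicate'
    : 'name-conflict'
}

const existing = [{
  streamId: 'hlcbw0daxt1e',
  originalFilename: '20231226_155500.WAV',
  checksum: '4b848e8c77412de247bbc22aa16db97caf5f68d1'
}]
const incoming = {
  streamId: 'hlcbw0daxt1e',
  originalFilename: '20231226_155500.WAV',
  checksum: 'b8ac3e27554f8d2cacf4ea9f210cb77c8958caa5'
}
console.log(classifyUpload(existing, incoming)) // name-conflict
```

The same file name sent to a different stream comes back as 'new', which matches the suggestion to allow identical file names across sites.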

Image

koonchaya commented 5 months ago

@rassokhina-e Is it possible that the checksum is incorrect? I think for the AudioMoth recordings there should not be duplicate checksums when the user records continuously.

rassokhina-e commented 5 months ago

Is it possible that the checksum is incorrect?

It is not possible; the checksums were correct. The user uploaded the recording to a site where a recording with the same name already exists with a different checksum, which means the user selected the wrong site to upload to.

I think for the AudioMoth recordings there should not be duplicate checksums when the user records continuously.

From the second recording we can see that the site related to it has a different stream ID, which confirms that the user selected the wrong site.

Image

koonchaya commented 5 months ago

@rassokhina-e Is there any issue with the checksum? What should we do next?

koonchaya commented 5 months ago

Can I close this ticket?