samsamros opened 10 months ago
Testing async file on its own, I had the following:

```nim
import std/[asyncfile, asyncdispatch, os]

const target = "path/files"

var f = openAsync(target, fmRead)
let
  fsize = getFileSize f
  chunkSize = 261120

for _ in countup(0, int(fsize - 1), chunkSize):
  let data = waitFor f.read(chunkSize) # waitFor so it works regardless of sync/async, since this is asyncfile
  echo data.len
```

The output:

```
261120 261120 261120 ... 261120 261120 65985
```

Switching chunkSize to a megabyte (1048576):

```
1048576 1048576 1048576 ... 204225
```

Works as expected.
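As a sanity check on those numbers: the loop performs `fsize div chunkSize` full reads plus one final partial read of `fsize mod chunkSize` bytes. A small sketch (the file size here is illustrative, chosen only to match the 65985-byte tail above):

```nim
# Illustrative only: a made-up file size that yields a 65985-byte
# final read with the 261120-byte chunk size used above.
let
  chunkSize = 261120
  fsize = 15 * chunkSize + 65985

echo fsize div chunkSize # number of full chunks: 15
echo fsize mod chunkSize # size of the final partial read: 65985
```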
The following code, run independently from the test, works fine:

```nim
import std/[asyncfile, asyncdispatch, os]
import anonimongo

const
  anoSocketSync* = defined(anoSocketSync)
  URI = "mongodb://localhost:27017/"
  saveas = "/home/testing/mongodb-mongosh_1.6.0_amd64.deb"
  target = "/home/testing/Downloads/mongodb-mongosh_1.6.0_amd64.deb"

when not anoSocketSync:
  type TheSock* = AsyncSocket
else:
  type TheSock* = Socket

var mongo = newMongo[AsyncSocket](MongoUri URI)
var grid: GridFS[TheSock]
var db: Database[TheSock]
var wr: WriteResult
let dbname = "newtemptest"

var f = openAsync(target, fmRead)
let
  fsize = getFileSize f
  chunkSize = 1048576

for _ in countup(0, int(fsize - 1), chunkSize):
  let data = waitFor f.read(chunkSize) # waitFor so it works regardless of sync/async, since this is asyncfile
  echo data.len

assert waitFor mongo.connect
db = mongo[dbname]
grid = waitFor db.createBucket(chunkSize = 1.megabytes.int32)
wr = waitFor grid.uploadFile(target)
echo wr.success
wr = waitFor grid.downloadFile(saveas)
echo wr.success
```
Is there an enforced limit on buffer size in the tests? Checking this out next.
So, I finally made it work. However, there are a few limitations to the test. First, I changed a few `check` macros to `doAssert`; secondly, I changed all streams above 131072 bytes down to 131072. There appears to be a buffer limitation inside the `test` template. I cannot confirm it yet, but all reads outside it work fine.
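For anyone wanting to isolate this, here is a minimal sketch of the kind of repro I have in mind, assuming a local file `sample.bin` larger than 131072 bytes (the path and the test name are placeholders, not part of the actual suite):

```nim
# Hypothetical repro: read more than 131072 bytes from an AsyncFile,
# once at top level and once inside a `test` block, and compare.
import std/[asyncfile, asyncdispatch, unittest]

const bigChunk = 5 * 1024 * 1024 # 5 MB, well above 131072

proc readOnce(path: string, n: int): int =
  var f = openAsync(path, fmRead)
  let data = waitFor f.read(n)
  close f
  data.len

echo readOnce("sample.bin", bigChunk) # outside the template

test "read above 131072 inside the template":
  check readOnce("sample.bin", bigChunk) > 131072
```

If the hypothesis is right, the top-level `echo` prints the full requested length while the `test` block trips the 131072 ceiling.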
Considering the above limitations, I'd like to propose running streams above 131072 in tests outside the `test` template, such as the 5.megabytes stream that is currently in the gridFS test right now.
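A sketch of what such a standalone check could look like, reusing the setup from the snippet above; `target` and `dwfile` are assumptions for illustration (the stored filename depends on how `uploadFile` names the GridFS entry):

```nim
# Standalone (outside the `test` template) version of the 5.megabytes
# stream comparison, mirroring the checks in the gridFS test.
import std/[asyncfile, asyncdispatch]
import anonimongo

const
  target = "/home/testing/Downloads/mongodb-mongosh_1.6.0_amd64.deb"
  dwfile = "mongodb-mongosh_1.6.0_amd64.deb" # assumed stored filename

proc largeStreamCheck(grid: GridFS[AsyncSocket]) {.async.} =
  let fivemb = 5.megabytes
  var f = openAsync(target, fmRead)
  var gf = await grid.getStream(bson({filename: dwfile}), buffered = true)
  f.setFilePos fivemb
  await gf.setPosition(fivemb)
  let
    binread = await gf.read(fivemb)
    bufread = await f.read(fivemb)
  doAssert bufread.len == binread.len
  doAssert bufread == binread
  doAssert f.getFilePos == gf.getPosition
  close f
  close gf
```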
I'm attaching a diff file with the changes I made to make the test work, pending all of y'alls review: gridfs.diff.zip
In addition, I'm sharing the diffs here to improve readability and searchability:
```diff
diff --git a/tests/test_gridfs_test.nim b/tests/test_gridfs_test.nim
index 112903b..53db33f 100644
--- a/tests/test_gridfs_test.nim
+++ b/tests/test_gridfs_test.nim
@@ -67,9 +67,9 @@ if filename != "" and saveas != "":
   test "Create default bucket":
     require db != nil
     when anoSocketSync:
-      grid = db.createBucket(chunkSize = 1.megabytes.int32)
+      grid = db.createBucket(chunkSize = 131072)
     else:
-      grid = waitfor db.createBucket(chunkSize = 1.megabytes.int32)
+      grid = waitfor db.createBucket(chunkSize = 131072)
     require grid != nil

   test "Upload file":
@@ -106,8 +106,9 @@ if filename != "" and saveas != "":
       var gf = grid.getStream(bson({filename: dwfile}), buffered = true)
     else:
       var gf = waitfor grid.getStream(bson({filename: dwfile}), buffered = true)
-    check f.getFileSize == gf.fileSize
-    check f.getFilePos == gf.getPosition
+    doassert f.getFileSize == gf.fileSize
+    doassert f.getFilePos == gf.getPosition
+
     let threekb = 3.kilobytes
     when anoSocketSync:
@@ -115,28 +116,32 @@ if filename != "" and saveas != "":
     else:
       var binread = waitfor gf.read(threekb)
     var bufread = waitfor f.read(threekb)
-    check bufread == binread
-    check f.getFilePos == gf.getPosition
+    doassert bufread == binread
+    doassert f.getFilePos == gf.getPosition
+
-    let fivemb = 5.megabytes
-    f.setFilePos fivemb
+    let kbStream = 131072 #changed
+    f.setFilePos kbStream
     when anoSocketSync:
-      gf.setPosition fivemb
+      gf.setPosition kbStream
     else:
-      waitfor gf.setPosition fivemb
-    check f.getFilePos == gf.getPosition
-
+      waitfor gf.setPosition kbStream
+    doassert f.getFilePos == gf.getPosition
+
+
     when anoSocketSync:
-      binread = gf.read(fivemb)
+      #it can't be fivemb because of the buffer limit in tests
+      binread = gf.read(kbStream)
     else:
-      binread = waitfor gf.read(fivemb)
-      bufread = waitfor f.read(fivemb)
-    check bufread.len == binread.len
-    check bufread == binread
-    check f.getFilePos == gf.getPosition
+      binread = waitfor gf.read(kbStream)
+      bufread = waitfor f.read(kbStream)
+    doassert bufread.len == binread.len
+    doassert bufread == binread
+    doassert f.getFilePos == gf.getPosition
     close f
     close gf
+
   test "Gridstream read chunked size":
     let chunkfile = "gs_chunks.mkv"
@@ -246,4 +251,4 @@ if filename != "" and saveas != "":
     close mongo
   if runlocal:
     if mongorun != nil and mongorun.running: kill mongorun
-    close mongorun
\ No newline at end of file
+    close mongorun
```
Considerations:

- According to the comment above, gridfs itself is working properly, but the `test` template fails with larger streams. I do, however, think we should devise a strategy to run these tests outside the `test` template. We should keep the current test, though, and include these changes if approved.
- The parameter `chunksize` was also being called as `chunkSize`. So, to standardize and improve readability, I changed all occurrences to `chunksize`.
.@mashingan I've tested this with mongodb v7.0... all tests pass in debian. I will begin testing on a few applications. But I think anonimongo is fully compatible with 7.0. Are you ok with me pushing a branch with the changes I made to the GridFS test or would you like to review the diff file first?
Please go ahead and proceed. Appreciate the efforts.
As promised, I will begin this thread as I try to debug the issues in the GridFS test. I'm also testing MongoDB v7.0. So far, CRUD operations work fine without enabling full v7.0 compatibility. I won't upgrade fully to 7.0 until we've run through these issues! Also, tests are still not fully operational in MongoDB v7.0. So here we go!

Compiler: Nim v2.0

Config:
First, I tried uploading a small file of around 1 kilobyte, which gave me an IndexDefect error, since part of the tests reads 5 megabytes as part of the overall test. See lines 121-122:
So, I used a 15.2 MB file. The first error showed the following:

Inspecting the resulting file `saveas`, I verified it had only downloaded 2 MB, confirmed by the following error:

I checked with pymongo and Python's gridfs driver and got the following:

This suggests the truncation was happening from the get-go, when uploading. I tried uploading an 80 MB file; only 32 MB got uploaded, but no error arose.
I began monitoring line 122 in `gridfs.nim`, which output:

Even when hardcoding chunks to 261120, the default for GridFS according to the MongoDB documentation, it did not work. When I deliberately switched the test to:

The test ran perfectly. Analyzing further, I hardcoded 1048576 in line 122 of `gridfs.nim` just to test what was happening; this was the output:

Then I hardcoded this in line 122:

I got the following for `data.len`:
**edit:** May this be a bug in the Nim std library? Or a protection against buffer overflows? It's not; it works as expected, see comment below.

Until I have more time to devote to this error, I'm just pointing out this change I made that made the test work:

**edit:** I thought this might be a bug in the std library. It's not, see comment below.
We can run more tests; I will continue through this week to find a solution to the issue. There are other issues when uploading other files. I tried uploading a .deb file just for kicks: it uploads alright, but overflows in the remaining tests and fails. I can download it from gridFS using Python without issues, but there's a point in the test where this breaks. Once we figure out what's happening with the 131072 value, we can move on with the rest of the tests!

This is just one step in debugging this, and I hope it helps.