iScsc / iscsc.fr

The iScsc website, built with passion by wannabe devs 🔥
GNU General Public License v3.0

Close #14 : Unhandled Error when trying to upload an article that is too big #17

Closed · atxr closed this 2 years ago

atxr commented 2 years ago

I propose a fix for #14. It's just a maximum file size for the upload. The maximum size is 40kB at the moment, because I didn't have any better idea. But I created a config file for the frontend, so this value can be changed quickly in a JSON file.
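For illustration, a minimal sketch of what such a config-driven frontend check could look like (the file name, key name, and function are hypothetical, not necessarily what the commit contains):

```js
// frontendConfig.json (hypothetical): { "maxUploadBytes": 40000 }
import config from "./frontendConfig.json";

// Deny the upload before reading the file if it is over the configured limit.
// `file` is a browser File object, whose `size` property is in bytes.
function checkUploadSize(file) {
  if (file.size > config.maxUploadBytes) {
    throw new Error(
      `File too big: ${file.size} bytes (max ${config.maxUploadBytes})`
    );
  }
}
```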

atxr commented 2 years ago

I added comments in the JSON config file in c283b14. It might look a bit weird, but there are no comments in JSON files. I will wait for @ctmbl's approval and then merge.
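Since the JSON spec has no comment syntax, a common workaround is to use ordinary keys as pseudo-comments, e.g. (purely illustrative, not necessarily the shape of c283b14):

```json
{
  "_comment": "maxUploadBytes is in bytes; tune it here if uploads start failing",
  "maxUploadBytes": 40000
}
```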

atxr commented 2 years ago

Finally, after some discussion with @ctmbl, we will cancel these changes. In this solution I implemented a frontend-only check to deal with big files, because I thought the error came from the frontend. However, the backtrace shows that the error is thrown in the backend. I will therefore check the size on the backend as well.

atxr commented 2 years ago

The bug spotted by @gbrivady comes from the express.json middleware. This is a piece of code that runs on each request between the frontend and the backend and parses the request body to JSON. I raised the maximum size of this middleware to 1MB in 4652cf1. The goal is to limit errors from this middleware and to catch big-file errors in the frontend/backend instead. For the frontend, nothing changed: same checks as before to deny reading big files. For the backend, I added in e1b47da a bunch of checks performed before the backend creates a new article in the database. Let me know if it's clearer for you now! I will ask you to check these changes before merging. (I ping @ctmbl here too.)
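Roughly, the two pieces could look like this (a minimal sketch assuming an Express backend; the route, field names, and exact checks are illustrative, not the code of 4652cf1/e1b47da):

```js
const express = require("express");
const app = express();

// Raise the body-parser limit so the middleware rarely fails before our own checks.
app.use(express.json({ limit: "1mb" }));

const MAX_BODY_CHARS = 40000; // illustrative value

// Validate the payload before creating the article in the database.
app.post("/articles", (req, res) => {
  const { title, summary, body } = req.body;
  if (typeof body !== "string" || body.length > MAX_BODY_CHARS) {
    return res.status(413).json({ error: "Article body too large" });
  }
  // ... create the article in the database ...
  return res.status(201).end();
});
```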

atxr commented 2 years ago

I have mainly two points:

  • try not to mix up bytes and characters! They are the same for ASCII, but that's not true for UTF-8 or other Unicode encodings

I know, but in the backend I check the character length for simplicity, and in the frontend the file size.

  • I don't get the point of having 3 different size limits... 1MB for the middleware, 3kB for the backend, and 40kB in the frontend??

The 3kB in the backend is an error; it should be 40kB. And I don't really know how to deal with this. The middleware needs a size limit, otherwise it throws weird errors which are difficult to catch. I'll try something to catch them later (one possible approach is sketched below), but at the moment I put 1MB in the middleware so that the backend/frontend error triggers first. Moreover, the middleware deals with the whole request, so body + title + summary + author + headers + maybe other things. So its limit can be a bit higher.
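For the record, those "weird errors" should be catchable with an Express error-handling middleware: body-parser tags oversized payloads with the type `entity.too.large`. A sketch of one way to catch them (not something this PR implements):

```js
// Registered after express.json(); runs whenever an earlier handler errors out.
app.use((err, req, res, next) => {
  if (err.type === "entity.too.large") {
    // The middleware rejected the body before our frontend/backend checks ran.
    return res.status(413).json({ error: "Request exceeds the middleware limit" });
  }
  return next(err); // let Express handle anything else
});
```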

atxr commented 2 years ago

Alright! We need @ctmbl's approval and we are good to close this one!

atxr commented 2 years ago

The 3kB in the backend is an error; it should be 40kB.

Apart from this point, which hasn't been modified, @atxr I think it's good to go.

I spotted why I did that. The size in the frontend is in bytes and in the backend in chars (don't ask why...). So if it's UTF-8 encoded, a char is 4 bytes, and 40kB is 10k chars. I fixed it in 84ecb33. Should be ready to merge! Sorry for the delay :rocket:

ctmbl commented 2 years ago

@atxr Actually I think there is a slight misunderstanding here, even if I'm fine with 10k chars as a limit for the backend.

UTF-8 encoding doesn't have a constant size in bytes per char, as you can read on the Wikipedia page or in either of these StackOverflow answers: 1 2. By the way, that's the reason why UTF-8 is so powerful! It is ASCII-transparent, and therefore really light for common characters, but it can also encode less common characters with heavier sequences (up to 4 bytes = 4 octets = 32 bits), while ASCII can't. UTF-32, on the other hand, is a constant 4 bytes per character, but it isn't widely used because it's quite heavy...

So I would say that you can't guess what the size in bytes of 10k UTF-8 characters will be, but 10k seems great, because you can't set the perfect value :confused:
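A quick Node illustration of the variable width (note that JavaScript's `.length` counts UTF-16 code units, so the snippet spreads each string into code points instead):

```js
for (const s of ["a", "é", "€", "𐍈"]) {
  console.log(s, "code points:", [...s].length, "UTF-8 bytes:", Buffer.byteLength(s, "utf8"));
}
// a  code points: 1  UTF-8 bytes: 1
// é  code points: 1  UTF-8 bytes: 2
// €  code points: 1  UTF-8 bytes: 3
// 𐍈  code points: 1  UTF-8 bytes: 4
```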

atxr commented 2 years ago

Mmm yes indeed, I already read about that but forgot :thinking: If we assume the worst case then, with only heavy 4-byte characters, 10k chars = 40kB. So with this 10k chars limit we ensure at most 40kB of data :thinking:
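In other words, a character-based backend limit still gives a hard byte bound (a sketch of that reasoning):

```js
// 10k code points * at most 4 UTF-8 bytes each => at most 40kB on the wire.
const MAX_CHARS = 10000;
const withinLimit = (text) => [...text].length <= MAX_CHARS;
// Whenever withinLimit(text) holds, Buffer.byteLength(text, "utf8") <= 4 * MAX_CHARS.
```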

atxr commented 2 years ago

I put 40k everywhere after a private discussion with @ctmbl!