cellarium-ai / cellarium-cas

Python client libraries for Cellarium Cloud Cell Annotation Service (CAS).
BSD 3-Clause "New" or "Revised" License

large submission handling #100

Open charlottedibiase opened 1 week ago

charlottedibiase commented 1 week ago

I am trying to submit about 32,500 cells to your public beta, but have not been able to do so. I received the following error messages:

I removed chunk 66 (the final chunk), only to receive this error

This happens even though the cell count is within the 50,000-cell weekly quota. Let me know what I can do to move forward!

mbabadi commented 1 week ago

Hi @charlottedibiase, thank you for submitting this issue. Would it be possible for you to share your AnnData with a member of our team for further investigation? At the same time, @fedorgrab @KevinCLydon, could you please look at the Sentry logs and investigate? Thanks.

mbabadi commented 1 week ago

@charlottedibiase, meanwhile, I increased your quota so that you can continue your experimentation. Please consider submitting smaller chunks (e.g. the first 5,000 cells) and see what happens. Keep us posted.
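A minimal sketch of taking the first 5,000 cells as a standalone subset, as suggested above. The helper name `first_n_cells` is hypothetical and not part of the CAS client; it only assumes the object has a `.shape`, supports row slicing, and has a `.copy()` method, which holds for an AnnData object as well as a NumPy array.

```python
import numpy as np


def first_n_cells(data, n=5000):
    """Return an independent copy of the first n rows (cells).

    Hypothetical helper: works on any object with .shape, row slicing,
    and .copy() (for example an AnnData object or a NumPy array).
    """
    n = min(n, data.shape[0])  # do not slice past the last cell
    # Slicing may return a view; .copy() materializes the subset.
    return data[:n].copy()


# Toy stand-in for a cells-by-genes matrix with 7 cells and 3 genes.
toy = np.arange(21).reshape(7, 3)
subset = first_n_cells(toy, n=5)
print(subset.shape)
```

With an AnnData object, the result could then be written to disk with `write_h5ad` and submitted on its own.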

charlottedibiase commented 6 days ago

Hi @mbabadi, thanks for your response. When I tried to submit a 5,000-cell subset, I got the same 'Request Entity Too Large' error. Attachment: 5000 cell AnnData file

fedorgrab commented 4 days ago

Hi @charlottedibiase,

Thank you for reaching out to us and participating in our Beta release! I've reviewed your data, and it seems you are submitting normalized data, which is likely the cause of the errors. CAS only accepts raw counts and performs all the necessary data normalization under the hood.

This is a great use case for us, as it would be helpful for CAS to notify users when normalized data is submitted. I've created an issue for this validation check (#103).

Please let us know if you’ve tried using CAS with raw counts and whether it works as expected.
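For anyone who wants to check their own data before submitting, here is a rough sketch of a raw-count heuristic: raw counts should be non-negative and integer-valued. This is not the validation planned in #103, and `looks_like_raw_counts` is a hypothetical name. It assumes the matrix is either a NumPy array or a SciPy-style sparse matrix that exposes its stored values as `.data`.

```python
import numpy as np


def looks_like_raw_counts(X, tol=1e-8):
    """Heuristic: True if every value is a non-negative integer.

    Hypothetical helper, not part of the CAS client.
    """
    # Sparse matrices have .nnz and keep stored values in .data;
    # a dense ndarray has no .nnz, so it is used directly.
    values = X.data if hasattr(X, "nnz") else X
    values = np.asarray(values, dtype=float)
    if values.size == 0:
        return True  # nothing stored, so nothing contradicts raw counts
    if np.any(values < 0):
        return False  # scaled or centered data often goes negative
    # Log-normalized or scaled values are rarely whole numbers.
    return bool(np.all(np.abs(values - np.round(values)) <= tol))


raw = np.array([[0, 3, 0], [1, 0, 7]])
print(looks_like_raw_counts(raw))            # integer counts
print(looks_like_raw_counts(np.log1p(raw)))  # log-normalized values
```

For an AnnData object, the matrix to check would typically be `adata.X`; if the raw counts were kept elsewhere (for example in `adata.raw` or a layer), that is the matrix to submit instead.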

fedorgrab commented 4 days ago

@charlottedibiase

Side note: I suspect we may be encountering the 413 Request Entity Too Large error because normalized data chunks are often larger than those with raw counts. This might vary depending on the normalization strategy. If, in your case, the normalization converts zero values to non-zero values, the matrix is no longer sparse and the chunk size could increase significantly.
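To illustrate the size effect with a toy example: a log transform keeps zeros at zero, while centering each gene turns every zero into a non-zero value. The byte estimate below assumes a CSR-like layout with 4-byte values and 4-byte indices; those widths and the helper name `csr_payload_bytes` are assumptions for illustration, not the actual CAS wire format.

```python
import numpy as np


def csr_payload_bytes(X, value_bytes=4, index_bytes=4):
    """Rough size of X in a CSR-like layout (assumed byte widths)."""
    n_rows = X.shape[0]
    nnz = int(np.count_nonzero(X))
    # One value and one column index per non-zero, plus the row pointers.
    return nnz * (value_bytes + index_bytes) + (n_rows + 1) * index_bytes


# Toy 3 cells x 4 genes count matrix, mostly zeros.
raw = np.array([[0, 3, 0, 0],
                [1, 0, 0, 7],
                [0, 0, 2, 0]], dtype=float)

log_norm = np.log1p(raw)                      # log1p(0) == 0, zeros preserved
centered = log_norm - log_norm.mean(axis=0)   # per-gene centering fills zeros

for name, mat in [("raw", raw), ("log1p", log_norm), ("centered", centered)]:
    print(name, np.count_nonzero(mat), csr_payload_bytes(mat))
```

In this toy case the centered matrix stores three times as many values as the raw one, so the same number of cells produces a noticeably larger payload.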