IGS / gEAR

The gEAR Portal was created as a data archive and viewer for gene expression data including microarrays, bulk RNA-Seq, single-cell RNA-Seq and more.
https://umgear.org
GNU Affero General Public License v3.0

High resolution .png download for static tSNE/MAP scatter plots #667

Closed: carlocolantuoni closed this issue 3 weeks ago

carlocolantuoni commented 2 months ago

@adkinsrs let me know that for static scatter plots, a low-res version is already rendered on the page and a higher-res version is downloaded. For several of the very large scRNA-seq and spatial RNA-seq datasets, we need more resolution than the current .png files provide in order to see the order in the images. What's the easiest solution here? Adding options to change the resolution manually would be great, but might be more work and add complexity. Can we simply increase the resolution of the downloaded version further?

adkinsrs commented 2 months ago

This will be done in the "ui-v2" branch, though I think it's a simple enough addition to hardcode on the v1 production code directly.

I think the way I want to approach this is to dynamically adjust the resolution (dpi) based on the number of cells currently being displayed. I do not think it is a good option to add to the curator, since it does not directly affect the way a plot looks (the actual curation), and I cannot see many users using a manual adjustment.

adkinsrs commented 2 months ago

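[Images: dpi150.png (https://github.com/IGS/gEAR/assets/5665914/bb540b6f-d133-4560-80f1-67ec3de661a2), dpi750.png (https://github.com/IGS/gEAR/assets/5665914/a572ce42-234f-4011-80de-3a32a4233eb0)]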

These two images are of the same dataset (~150000 cells), but the top one is 150 dpi and the bottom one is around 750 dpi (formula int(150 + (#num_filtered_cells / 500))). If you save each image and zoom in, you can see the difference in detail. @carlocolantuoni is 750 dpi enough for this dataset?
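The DPI formula quoted above can be written as a small helper. This is only a sketch: the function name `recommended_dpi` and its parameter names are illustrative and not taken from the gEAR code; only the base of 150 and the divisor of 500 come from the comment.

```python
def recommended_dpi(
    num_filtered_cells: int,
    base_dpi: int = 150,
    cells_per_dpi: int = 500,
) -> int:
    """Scale the saved-figure DPI with the number of cells being plotted."""
    if num_filtered_cells < 0:
        raise ValueError("num_filtered_cells must be non-negative")
    # Same arithmetic as int(150 + (#num_filtered_cells / 500)).
    return int(base_dpi + (num_filtered_cells / cells_per_dpi))


if __name__ == "__main__":
    # Print the DPI the formula yields for a few dataset sizes.
    for n in (0, 10_000, 100_000):
        print(n, recommended_dpi(n))
```

Because the DPI grows linearly with the cell count, the pixel count of the saved image grows roughly quadratically, which is relevant to the browser-memory concern raised below.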

One issue I encountered with changing the "save_dpi" value is that internally, the plot is saved as a bytestream and sent as a response back to the browser. This means that what is viewed in the browser is the same image as what is saved, resolution and all. With higher DPIs and more datasets displayed simultaneously, it really takes a toll on the browser. I crashed my browser tab when trying to view all datasets in the Dev Human Hypothalamus profile.

carlocolantuoni commented 2 months ago

750 looks good for this example! Is there a way to load only the display image when viewing the page, and load the saved image only when downloading, so we don't crash things?

adkinsrs commented 2 months ago

750 looks good for this example! Is there a way to load only the display image when viewing the page, and load the saved image only when downloading, so we don't crash things?

Sadly I don't think this is possible. As I mentioned, on the server side the display is saved as a byte stream using the DPI specified in the "save_dpi" parameter (the value I adjusted). This byte stream grows or shrinks with the DPI specified. Those bytes are sent from the server as a response to the browser, which recreates the image from that data. So the displayed image and the downloaded one are the same, as they originate from the same byte stream.
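To illustrate the mechanism described here, the following sketch renders a scatter plot into an in-memory byte stream with plain matplotlib and compares the stream size at two DPI values. It is not the gEAR server code: the helper name `render_scatter_png`, the figure size, and the random data are assumptions made for the example.

```python
import io

import matplotlib

matplotlib.use("Agg")  # headless backend, as would be used on a server
import matplotlib.pyplot as plt
import numpy as np


def render_scatter_png(x, y, dpi: int) -> bytes:
    """Render a scatter plot and return the PNG bytes (hypothetical helper)."""
    fig, ax = plt.subplots(figsize=(4, 4))
    ax.scatter(x, y, s=1)
    buf = io.BytesIO()
    # The dpi passed here fixes the pixel dimensions of the single image
    # that would be both displayed and downloaded.
    fig.savefig(buf, format="png", dpi=dpi)
    plt.close(fig)
    return buf.getvalue()


rng = np.random.default_rng(0)
x, y = rng.normal(size=(2, 5000))  # stand-in for tSNE coordinates
low = render_scatter_png(x, y, dpi=150)
high = render_scatter_png(x, y, dpi=450)
print(len(low), len(high))  # byte counts of the two streams
```

Since both the page and the download consume the same bytes, raising the DPI for one necessarily raises it for the other unless a second render is made.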

The only way I could see splitting the two would be to make an explicit image save to /tmp or somewhere, then writing an explicit web function to find the /tmp image to download, which a) is a lot of work and b) can create a lot of waste images on the server.

adkinsrs commented 2 months ago

Another thing I am researching is whether any of the other file types that matplotlib can save to produce smaller files.

adkinsrs commented 2 months ago

One potential solution I have found is to use "webp" format. WebP is a newer image format that compresses images 23% smaller than PNGs but retains the same image quality. This would require updating the Matplotlib and Pillow python packages, but I tested locally and it does work.

Another alternative would be to output the tSNE plots as SVG graphics, which could be a good candidate since we are essentially drawing a bunch of circles on a page, and SVGs, being vector images, scale infinitely. However, the SVG file is actually larger than the original 750 dpi PNG, so I would suggest switching to the WebP image format instead. Another issue with SVG is that a user who downloads the SVG image will most likely need to convert it to a format like PNG that is easier to edit.

carlocolantuoni commented 2 months ago

OK, thanks for researching. Two questions: 1) Is WebP a format users will be able to view/use/edit as they need? 2) Is the 23% decrease in size enough to avoid problems with large profiles? Want to ask @jorvis or others for advice here? These hi-res images are important for the big spatial datasets, but I don't want to cause 10 new problems by getting the hi-res images.

adkinsrs commented 2 months ago

I would have to test to see if the size decrease is enough to not crash things... checking on nemo-devel would be perfect for this.

As for "webp" format, all major browsers support the format. You can download the files, but if you want to use software to edit it, it may be hit or miss though native support will increase over time. For instance, I used Preview on Mac to edit my downloaded webp file, and the app prompted me to convert the file to TIFF format before editing.

adkinsrs commented 2 months ago

Implemented the "webp" solution at nemo-devel. Also set the saved figure DPI to int(150 + (#num_filtered_cells / 500)).

adkinsrs commented 2 months ago

Also, I upgraded matplotlib to 3.6.1 and Pillow to 10.2.0. The versions can be found in docker/requirements.txt.

adkinsrs commented 2 months ago

Alternate thought I had.

Always send the 150 dpi bytestream as the server response to display the tSNE on the page. Then, if the recommended dpi (see formula above) is above a threshold like 300 or 500, we save the higher-dpi image to the server and create a "download higher quality" button that pulls that image when clicked. This way, we don't save every tSNE image, only the ones with more cells.
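The decision described in this comment could look roughly like the sketch below. The names `RenderPlan` and `plan_render` are hypothetical, and the threshold of 300 is just one of the two values floated above; nothing here is taken from the actual implementation.

```python
from dataclasses import dataclass
from typing import Optional

DISPLAY_DPI = 150             # always used for the in-page image
HIGH_QUALITY_THRESHOLD = 300  # assumed cut-off; the comment suggests 300 or 500


@dataclass
class RenderPlan:
    display_dpi: int
    download_dpi: Optional[int]  # None means no separate high-quality file


def plan_render(
    num_filtered_cells: int,
    threshold: int = HIGH_QUALITY_THRESHOLD,
) -> RenderPlan:
    """Decide whether a separate high-DPI download image is worth saving."""
    recommended = int(DISPLAY_DPI + (num_filtered_cells / 500))
    if recommended > threshold:
        # Large dataset: keep the page light, offer the big image on demand.
        return RenderPlan(DISPLAY_DPI, recommended)
    # Small dataset: the displayed image is already good enough.
    return RenderPlan(DISPLAY_DPI, None)


if __name__ == "__main__":
    print(plan_render(10_000))
    print(plan_render(100_000))
```

Only plots whose plan has a non-empty `download_dpi` would need an extra file on the server, which limits the number of leftover images.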

adkinsrs commented 2 months ago

Yet another thing to consider

Have pagination on the expression/projection pages that shows 10 or so datasets at a time. We would lazy-load only the datasets on the current page, which would reduce the browser memory used.

carlocolantuoni commented 2 months ago

This download higher quality button idea sounds perfect to me

adkinsrs commented 2 months ago

This download higher quality button idea sounds perfect to me

I think what I would do is just have the button there regardless and that button would just rerun the tSNE with higher DPI for downloading purposes. That way, we can use the lower DPI options as usual for viewing and let the user control which images they want a higher quality picture of (plus I can use PNG instead of WebP if people don't like that). However the download would be on a per-display basis (no download-all functionality).

carlocolantuoni commented 2 months ago

got it - sounds good

adkinsrs commented 1 month ago

[Screenshot 2024-03-18 at 11.20.41 AM: https://github.com/IGS/gEAR/assets/5665914/d66c912a-fd37-44a1-b484-89f3dabbdf02]

Added a new "download PNG" button that appears if we have a scanpy plot; I could expand this to Plotly and SVG in the future if requests arise. This button downloads the higher-quality PNG.

I had some issues where scanpy was inconsistently returning incorrect bytestreams when the tSNE API call was made for both the regular and the high-dpi plot, which led to small images (or low-res ones if you enlarge them). I am still investigating this, but I think matplotlib may have some thread-safety issues. Ultimately, I moved the DPI setting for scanpy to right before I save the figure, and it now seems to generate correctly and consistently.
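One way to catch the "small image" symptom described above is to read the pixel dimensions straight from the returned PNG bytes and compare them with the figure size times the requested DPI. This is a sketch rather than anything from the gEAR code: the helper names are hypothetical, and the size check assumes the figure is saved without any bounding-box cropping (for example `bbox_inches="tight"`), which would change the output dimensions.

```python
import struct
from typing import Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data: bytes) -> Tuple[int, int]:
    """Return (width, height) in pixels from the IHDR chunk of a PNG."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG byte stream")
    # Width and height are big-endian 32-bit integers at offsets 16 and 20.
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def has_expected_size(
    data: bytes,
    width_in: float,
    height_in: float,
    dpi: int,
    tolerance: int = 1,
) -> bool:
    """Check that the PNG matches figure size (inches) times the given DPI."""
    width, height = png_dimensions(data)
    return (
        abs(width - round(width_in * dpi)) <= tolerance
        and abs(height - round(height_in * dpi)) <= tolerance
    )
```

A check like this could be run against the response of the high-dpi call to flag a stream that was rendered at the display DPI by mistake.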

I also rewrote some of the tSNE image display code to make use of URL.createObjectURL, a modern JS API. The nice thing about this is that a) the URL looks cleaner upon inspection, as the bytestream is saved to a Blob object, and b) you can run URL.revokeObjectURL to free up the memory, which was not an option when the data stream was encoded directly into the HTML img tag. However, I am not running the "revoke" code for the time being, since if you right-click to save the (lower-quality) image, the browser cannot find the data once the memory has been freed.

carlocolantuoni commented 1 month ago

To test this I'm trying to add datasets to profiles in devel, and I am unable to do so. Every time I try to add any dataset to any profile I get this error:

Fail. Sorry, something went wrong. Please contact us with this message if you need help. (Error: 200 SyntaxError)

Also, when I try to view datasets currently in profiles, for example with this link: https://devel.nemoanalytics.org/expression.html?gene_symbol=PCP4&gene_symbol_exact_match=1&is_multigene=0&layout_id=dbcb79b1 I am getting these errors:

[image: image.png]

adkinsrs commented 1 month ago

Looking into the error with the URL link now... something up with the ortholog mapping server-side call.

For the "add datasets to profile" issue, I assume you are doing this using the dataset explorer, right? I am currently rewriting that page, and that rewrite is not on the devel server. We could try to diagnose the issue immediately or wait for me to finish the updated page and test it then.

adkinsrs commented 1 month ago

Looking into the error with the URL link now... something up with the ortholog mapping server-side call.

This should be resolved. Python packages on the devel server needed to be updated.

carlocolantuoni commented 1 month ago

great

carlocolantuoni commented 1 month ago

sounds better to wait for the updated page - how long might that be?

adkinsrs commented 1 month ago

I can have it pushed there by next week sometime. If I don't get all functionality resolved, I can at least get the stuff relevant for you to add to profiles implemented.

carlocolantuoni commented 1 month ago

Ok thnx

carlocolantuoni commented 1 month ago

Any luck with the dataset manager rewrite and/or getting it onto devel? (Related to the "add datasets to profile" issue above.) Or were you just going to do fixes?

adkinsrs commented 1 month ago

I am going to respond via email to keep this ticket relevant only to its topic. You can respond either through email or through ticket #671.