Open JinqianPan opened 1 month ago
Hi. You could go into the dinov2 folder, select all the .pt files, and try downloading them. You could also look for a download tool that works with OneDrive/SharePoint.
If you already downloaded the original WSI files (with .svs extension), you could use our provided DINOv2 feature extractor to extract the features.
Thank you for your reply!
The first way, selecting all the .pt files, is feasible only when the total size is under 20GB. Alternatively, could you divide the .pt files into 17 folders, each containing up to 20GB, so that we could download the data in 17 parts? I am still searching for a tool that can download this much data from OneDrive or SharePoint. As for the last way, downloading the original WSI files, I am looking for a way to filter data on the GDC Data Portal to avoid downloading the entire 8.9PB of data.
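To make the "folders of up to 20GB" idea concrete, here is a minimal sketch of how the files could be grouped. It assumes the .pt files are reachable on a local disk for whoever does the splitting; `plan_batches` and `scan_pt_files` are hypothetical helper names, not part of the repository, and the 20GB figure is the limit mentioned in this thread.

```python
from pathlib import Path
from typing import Dict, List

GIB = 1024 ** 3  # bytes per GiB


def scan_pt_files(folder: str) -> Dict[str, int]:
    """Map each .pt file name in `folder` to its size in bytes."""
    return {p.name: p.stat().st_size for p in Path(folder).glob("*.pt")}


def plan_batches(sizes: Dict[str, int], limit_bytes: int = 20 * GIB) -> List[List[str]]:
    """Group files so that no batch exceeds `limit_bytes` (first-fit decreasing)."""
    batches = []  # each entry is [total_bytes, [file names]]
    # Placing the largest files first tends to leave fewer half-empty batches.
    for name, size in sorted(sizes.items(), key=lambda kv: kv[1], reverse=True):
        if size > limit_bytes:
            raise ValueError(f"{name} alone is larger than the batch limit")
        for batch in batches:
            if batch[0] + size <= limit_bytes:
                batch[0] += size
                batch[1].append(name)
                break
        else:
            # No existing batch has room, so start a new one.
            batches.append([size, [name]])
    return [names for _, names in batches]
```

Each returned list could then be moved into its own subfolder (for example `part_01`, `part_02`, ...), so that every subfolder can be downloaded as a single zip.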
If you are downloading to your personal PC, using the OneDrive client for Windows/Mac to synchronize the data folder, rather than downloading directly, should be a better solution. After that, upload the files from your personal PC to your Linux server.
For the TCGA Data Portal, you could simply download the .svs files of the TCGA program using the filters provided by the website. There are many tutorials online.
Hi, and thank you for your continuing help and support.
I'm trying to download the feature files as suggested, but I cannot synchronize your data folder because it hasn't been explicitly shared with my OneDrive account. If I provide my email, would you be willing to do this, or do you have an alternative suggestion? I've looked at other options, including scraping or even downloading manually, but strangely I'm not even able to sort your files by size, which would make this process easier. I have downloaded the SVS files, but pre-processing them will take a considerable amount of time with my current resources.
Hi, yes, please share your email and I will manually add your account to the viewer list.
Many thanks, although I cannot see the share under my OneDrive account. Have you set it up correctly?
That's weird. Maybe try to use this link directly: https://hkustconnect-my.sharepoint.com/:f:/g/personal/zguobc_connect_ust_hk/EhmtBBT0n2lKtiCQt97eqcEBvO9WwNM3TL9x-7-kg_liuA?e=1N4FHk
After you open the link, select all folders and files. There is then a button named "Copy to". Clicking that button should make it possible to copy the files to your own OneDrive.
I can't see an option to specify my account. It only allows me to copy to an existing location in your OneDrive.
Is there any tutorial about syncing this folder to your Onedrive from my side?
Try this. You should be able to specify my email address as a viewer of the folder. Thanks again for helping me with this.
So strange!
Although I received your email, I cannot see the folder in my OneDrive. The issue is the size limit on downloading the zip. Would it be at all possible for you to use WinZip to create a multi-file zip, say 5-10GB for each file? This would remove the issue and allow me to download the 30-50 separate files. WinZip should make it easy to do this.
10gb files seem to be ok. I'm not sure what the limitation is? I think it's around 15-20gb.
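If WinZip is not at hand, the same multi-part idea can be sketched in plain Python: cut one large archive into fixed-size parts, download the parts separately, and concatenate them again. This is only an illustration; `split_file` and `join_files` are hypothetical helpers, and the `.partNNN` naming is an assumption rather than any standard archive format.

```python
import shutil
from pathlib import Path
from typing import List


def split_file(src: str, part_bytes: int, chunk_bytes: int = 1 << 20) -> List[Path]:
    """Split `src` into parts of at most `part_bytes` bytes, written next to it."""
    if part_bytes <= 0 or chunk_bytes <= 0:
        raise ValueError("part_bytes and chunk_bytes must be positive")
    src_path = Path(src)
    parts: List[Path] = []
    with open(src_path, "rb") as f:
        while True:
            # Read ahead first so that no empty trailing part is created.
            chunk = f.read(min(chunk_bytes, part_bytes))
            if not chunk:
                break
            part = src_path.with_name(f"{src_path.name}.part{len(parts):03d}")
            written = 0
            with open(part, "wb") as out:
                while chunk:
                    out.write(chunk)
                    written += len(chunk)
                    if written >= part_bytes:
                        break
                    chunk = f.read(min(chunk_bytes, part_bytes - written))
            parts.append(part)
    return parts


def join_files(parts: List[Path], dst: str) -> None:
    """Concatenate the parts, in the given order, back into one file."""
    with open(dst, "wb") as out:
        for part in parts:
            with open(part, "rb") as f:
                shutil.copyfileobj(f, out)
```

For example, `split_file("features.zip", 10 * 1024**3)` would produce parts of at most 10GB each (the file name here is made up), and `join_files` on the sorted parts would rebuild the archive on the receiving side.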
Yeah, the OneDrive limitation can be annoying. I will see whether I can move these files to Google Drive or somewhere else, but this might take a lot of time since I didn't save the files locally on my PC.
I suggest running the feature extraction code. To speed that up, you could run the script multiple times in parallel. For example, you could generate a reversed CSV file and then run the feature extraction code on both the original CSV and the reversed CSV, which should roughly halve the time.
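A minimal sketch of generating the reversed CSV, assuming the slide list is a plain CSV with one header row. `write_reversed_csv` is a hypothetical helper, not part of the repository. Note that the two runs only split the work if the extraction script skips slides whose features already exist, which is an assumption worth checking before starting.

```python
import csv


def write_reversed_csv(src: str, dst: str) -> int:
    """Copy `src` to `dst` with the data rows in reverse order; the header stays first.

    Returns the number of data rows written.
    """
    with open(src, newline="") as f:
        rows = list(csv.reader(f))
    if not rows:
        raise ValueError(f"{src} is empty")
    header, body = rows[0], rows[1:]
    with open(dst, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(reversed(body))
    return len(body)
```

One process would then work through the original CSV from the top while a second works through the reversed copy from the bottom, so the two meet roughly in the middle.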
Thanks for trying. Good suggestion. Will do.
Would you mind removing my personal email address from the comments (if possible)?
Sure thing. No worries.
The 'Copy to' button might only work on your own account. It is weird that we could not copy the files into our own OneDrive.
Hi,
I am trying to work out how to download the data from SharePoint. The official limit for a single zip download from SharePoint is 20GB, which is far from enough here. Is there any way to get around this limit so that we can download the data?