AllenInstitute / abc_atlas_access

Documentation and examples demonstrating how to access data from the Allen Brain Cell Atlas
https://alleninstitute.github.io/abc_atlas_access/

Issue on page /intro.html #40

Open stefanonard85 opened 7 months ago

stefanonard85 commented 7 months ago

Hi all,

A few requests:

  1. It is unclear where the final MERFISH and snRNA-seq datasets are on AWS; no README is supplied, and several folders have no description.
  2. The repository is a mess, and half of the sections are damaged.
  3. Your viewer is slow and offers little insight, since you cannot plot two genes at the same time and you are not plotting each individual transcript. We are experts in the hypothalamus and the pons, and I can tell that half of your clusters are wrongly assigned.
  4. Where are the files with the cell segmentation, the single-gene transcript coordinates, and the DAPI?

Stefano Nardone, PhD

tylenolncuff commented 7 months ago

Hi Dr. Nardone, apologies for the delay. For future reference, it's fine to post here, but for faster responses you can reach us at our Community Forum.

  1. If I understand you correctly, we fully agree that navigating the S3 folders is tough. Our recommendation is to use the notebooks in this GitHub repo (like this one), which will guide you through accessing and using the files more easily. If that doesn't work for you, I'd love to hear more about the gaps and challenges.
  2. When you say the "repository" is a mess and the "sections" are damaged, do you mean a) this GitHub repo and the notebook "sections", or b) the S3 bucket and the MERSCOPE sections?
  3. We're always working on performance, so thanks for that feedback. I'd love to connect someone with you to learn more about a) your needs/asks of the ABC Atlas and b) opportunities for you to help contribute to the cell type annotations. Could I have them email you for more info?
  4. I believe many or all of these files are available at the BIL archive, which is linked from the ABC Atlas for future reference. Unfortunately, those directories are quite difficult to navigate, so I'm going to ask one of our team members to post a guide on our Community Forum and then update you here when it's available.

stefanonard85 commented 7 months ago

Hi Taylor,

Sorry for the follow-up. Thanks a lot for your reply, and sorry for the misunderstanding.

I would like to point out that the file names in the BIL archive do not correspond to the ones on AWS, which makes it nearly impossible to associate files stored in the two locations. For instance, I wanted to use your data for a publication we have in preparation, but it is very hard to understand the criteria used to store the files. As I understand it, there are two papers and six main batches (paper 1: 500 genes, 2 batches; paper 2: 1100 genes, 4 batches). No README file explains what these files are and where they are located.

On top of that, I have plotted the MERFISH data, but all the sections in the paper are half sections, which makes it very hard, if not impossible, to determine the exact bregma level. In addition, many areas show an apparent angle, meaning the sections sample different levels.

Finally, we have performed MERFISH on the entire dorsal pons, using 1 million cells from that region alone (42 coronal sections), and the results align, apart from some discrepancies. Additional annotation will be needed.

My point is that this manuscript still represents a valuable resource. As an example, I am citing my work, in which we used the snRNA-seq data from both ABA publications. I want to state that the issues relate only to the MERFISH dataset.

Reproducible, accessible data and results are a must for a journal publication, especially for a piece of work as essential as I believe this one is. By this I do not mean that the analyses and clustering were done incorrectly; it is just an annotation issue that requires time. Unfortunately, the field is very competitive and the technologies are advancing fast, and all of this requires a considerable amount of time. I would be delighted and honored to speak with somebody on the team to understand more about the data and to offer some suggestions for the spatial dashboard, since we have just built one using GPU acceleration.

I'm looking forward to working with you.

Kind Regards, Stefano Nardone, PhD