theathorn commented 4 years ago

@mvonpapen commented on Tue Sep 24 2019

Thank you for submitting a portal to the HCA DCP Methods Registry!

To expedite your portal's addition to https://data.humancellatlas.org/analyze, please provide the following package metadata. You can easily edit this information later by clicking "Improve this page" at the bottom of your portal's detail page (example).

Required:

Tool title: FASTGenomics
Contact name: Michael von Papen
Contact email: michael.papen@comma-soft.com
Who to attribute: CC LifeScience @ Comma Soft AG
Portal URL: www.fastgenomics.org
Short description: FASTGenomics is a platform to share scRNA-seq data and analyses. Users can either choose from best practices or create individual workflows for the exploration of gene expression data.

Optional:

Long description:

FASTGenomics - a platform to share single-cell RNA sequencing data and analyses using reproducible workflows

Recent technological advances enable genomics of individual cells, the building blocks of all living organisms. Single cell data characteristics differ from those of bulk data, which led to a plethora of new analytical strategies. However, solutions are only useful for experts and currently, there are no widely accepted gold standards for single cell data analysis. To meet the requirements of analytical flexibility, reproducibility, ease of use and data security, we developed FASTGenomics as a powerful, efficient, versatile, robust, safe and intuitive analytical ecosystem for single-cell transcriptomics [Scholz et al., 2018]. This development has been carried out at Comma Soft, Bonn, in collaboration with the Schultze lab at LIMES Institute in Bonn, Germany.

FASTGenomics is designed as a platform for single cell RNA-seq data open to the scientific community. A major feature is to provide the highest reproducibility and transparency for single cell data analysis to the whole community. The platform provides publicly available datasets and analyses for exploration and visualization of gene expression data. Using docker containers provides full reproducibility and helps avoid "works only on my machine" problems. Register now at https://www.fastgenomics.org and have a look at our collection of data sets and example analyses.

FASTGenomics serves as a platform, where you can share your data with the community and test novel algorithms on public data sets with known results. FASTGenomics scales already routinely to more than 300k cells per project and prototype apps suggest that scaling to 1M cells is also possible [Scholz et al., 2018]. Moreover, its hybrid design also allows custom solutions such as FASTGenomics on premise for clinical and pharmaceutical research facilities.

For more information, you can find us on Twitter, LinkedIn or come talk to us on Slack. To register for the platform visit us at www.fastgenomics.org.

Note: We will soon offer interactive analyses based on Jupyter notebooks. Stay tuned for the upcoming beta-test!

REFERENCES Scholz et al. (2018) FASTGenomics: An analytical ecosystem for single-cell RNA sequencing data. BioRxiv, 272476.

Logo or screenshot:

matthewspeir commented 4 years ago

Hello, @mvonpapen.

Thank you for submitting an analysis portal to the DCP. I will review your portal against our registry standards and let you know if we have any questions.

matthewspeir commented 4 years ago

Hello, @mvonpapen.

Thank you again for submitting an analysis portal to the registry. I'm hoping you can answer a couple of questions about how your portal adheres to our standards:

For the "Use Containers and Modules" standard, it looks like you have Docker containers for your tools at https://hub.docker.com/u/fastgenomics. I noticed that you have github repos for your python client and r client for your portal, but I didn't see a docker instance for either of these in Docker hub. Am I just not seeing them or are they not present? If they're not present, are there plans to add them?
For the "Register Upstream" standard, I can't seem to find your platform, clients, or the related utilities registered in bioconductor or bioconda. Are there plans to register all of your stuff in one of these platforms? (It's also possible they're already there and I just missed them.)

Thanks!

Matthew

mvonpapen commented 4 years ago

Hi Matthew,

The python and r clients you listed are actually outdated and will be deposited soon. They were used to access our database from outside the platform. Right now, we are actually in a closed beta phase and will soon relaunch. For that, we set up two new clients, fgread-r https://github.com/FASTGenomics/fgread-r and fgread-py https://github.com/FASTGenomics/fgread-py, that are used to load data from our internal database within the platform ( https://prod.fastgenomics.org). As such, the new clients are part of the jupyter images that we provide for the analyses, see, e.g., https://hub.docker.com/r/fastgenomics/jupyter-scanpy.
In the future, we may again plan to develop clients for external access to our database. These clients could then well be added to bioconda and/or bioconductor. As our current clients are only working within the platform, we did not think about adding them to these repositories. We have also moved away from providing analysis apps (which might be added to bioconda/bioconductor) to providing complete Jupyter notebooks for the analyses. The platform itself belongs to Comma Soft AG https://www.comma-soft.com in Bonn, Germany, and will not be open-sourced.

I hope I could answer your questions. Please let me know if you need additional information.

Best regards, Mitch


mvonpapen commented 4 years ago

Hi @matthewspeir,

Did you receive my response or do you need any additional information? Just to let you know, we will soon switch our platform to beta.fastgenomics.org. prod.fastgenomics.org will be closed at the end of the month.

Best, Mitch

mvonpapen commented 4 years ago

Dear @theathorn ,

A new version of FASTGenomics has been launched and therefore I hereby slightly update the original description. The updated submit is the following:

Tool title: FASTGenomics
Contact name: FASTGenomics Team
Contact email: contact@fastgenomics.org
Who to attribute: Comma Soft AG, Bonn
Portal URL: www.fastgenomics.org
Short description: FASTGenomics is a platform to explore, analyze and share scRNA-seq data with interactive Jupyter notebooks. Users can create individual workflows or choose from best practices notebooks in Python and R to explore gene expression data in various data formats.

Optional:

Long description:

FASTGenomics - a platform to analyze and share single-cell RNA sequencing data with reproducible Jupyter notebooks

webpage_alt

Recent technological advances enable genomics of individual cells, the building blocks of all living organisms. Single cell data characteristics differ from those of bulk data, which led to a plethora of new analytical strategies. However, solutions are only useful for experts and currently, there are no widely accepted gold standards for single cell data analysis. To meet the requirements of analytical flexibility, reproducibility, ease of use and data security, we developed FASTGenomics as a powerful, efficient, versatile, robust, safe and intuitive analytical ecosystem for single-cell transcriptomics [Scholz et al., 2018]. This development has been carried out at Comma Soft, Bonn, in collaboration with the Schultze lab at LIMES Institute in Bonn, Germany.

fg_jupyter

FASTGenomics is designed as a platform for single cell RNA-seq data open to the scientific community. A major feature is to provide the highest reproducibility and transparency for single cell data analysis to the whole community. The platform provides publicly available datasets and analyses in the form of Jupyter notebooks for the exploration and visualization of gene expression data. Using docker containers provides full reproducibility and helps avoid "works only on my machine" problems. Register now at https://www.fastgenomics.org and have a look at our collection of data sets and example analyses.

FASTGenomics serves as a platform, where you can share your data with the community and test novel algorithms on public data sets with known results. FASTGenomics scales already routinely to more than 300k cells per project and prototype apps suggest that scaling to 1M cells is also possible [Scholz et al., 2018]. Moreover, its hybrid design also allows custom solutions such as FASTGenomics on premise for clinical and pharmaceutical research facilities.

For more information, you can find us on Twitter, LinkedIn or come talk to us on Slack. To register for the platform visit us at www.fastgenomics.org.

REFERENCES Scholz et al. (2018) FASTGenomics: An analytical ecosystem for single-cell RNA sequencing data. BioRxiv, 272476.

Logo or screenshot:

matthewspeir commented 4 years ago

Hello, Michael.

I apologize for not responding sooner; I just returned from 3 weeks of traveling last Friday.

Based on your response, it doesn't sound like your tool meets the registry standards (specifically 'Be Free and Open Source' and 'Register Upstream'): https://data.humancellatlas.org/contribute/analysis-tools-registry/registry-standards. Can you please review all of the standards and describe how your tool meets the six 'Required' ones listed on that page?

mvonpapen commented 4 years ago

Hi @matthewspeir ,

I hope you had a good trip :-)

FASTGenomics is indeed a little different from the other analysis tools on the HCA site in the sense that we are an online platform and not a tool or (yet) a tertiary portal. Also, our "method packages" are Jupyter notebooks and not real packages or modules. However, all our analyses and datasets are open and accessible, so I think that we meet the registry standards.

So let's go through the registry standards one by one:

Be Free and Open Source: Our readers (fgread-r and fgread-py) as well as our "method packages" (the Jupyter notebooks, `analysis*`) are freely available on our github page. Licenses are contained in the repositories. The images we use are freely available on our dockerhub page.
Use Containers and Modules: The Jupyter notebooks are run within docker containers that are freely available on our dockerhub page. What our platform does is the following: it loads the image from dockerhub and copies the data from our database together with the Jupyter notebooks from github into the container.
Register Upstream: Our readers are registered on PyPI (here) and can be installed via pip. As our "method packages" are plain Jupyter notebooks, we think it would not make sense to register them in an upstream registry. The images for FASTGenomics are publicly available in the docker registry.
Support Standard Data Formats: Our readers currently support the most common standard data formats: rds, hdf5, h5ad, loom, csv, tsv, and soon mtx.
Document Installation and Usage: We offer an instructive git repository, analysis-test-environment, which can be used to run all analyses on a local machine. It comes with documentation and instructions on how to set it up, how to load the data and analyses, and how to run everything with docker.
Provide Testing Data: Test data in all supported formats can be found in the git repository test-data. The test data include 11 versions of all supported data formats including the 3k PBMC dataset from 10x.
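
To make the "Support Standard Data Formats" point a bit more concrete, here is a minimal sketch of how a reader could map a file name to one of the formats listed above. This is only an illustration and not the actual fgread code: the names SUPPORTED_FORMATS and detect_format are hypothetical, and the only thing taken from the list above is the set of file extensions.

```python
from pathlib import Path

# Hypothetical lookup table for illustration only. The extensions are the
# formats named in the list above; mtx is left out because it is only
# announced as "soon".
SUPPORTED_FORMATS = {
    ".rds": "rds",
    ".hdf5": "hdf5",
    ".h5ad": "h5ad",
    ".loom": "loom",
    ".csv": "csv",
    ".tsv": "tsv",
}


def detect_format(path):
    """Return the format name for a file path, based on its extension.

    Hypothetical helper: raises ValueError for extensions that are not
    in SUPPORTED_FORMATS.
    """
    suffix = Path(path).suffix.lower()  # ".H5AD" and ".h5ad" are treated alike
    if suffix not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported data format: {suffix or path!r}")
    return SUPPORTED_FORMATS[suffix]
```

With this sketch, detect_format("pbmc3k.h5ad") gives "h5ad", while a file such as "matrix.mtx" raises a ValueError until that format is added to the table.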

Please let me know, if I'm missing something or if you need further clarification.

matthewspeir commented 4 years ago

Hi, @mvonpapen.

Thanks for providing those details. Just to be sure I'm understanding correctly, you're providing open-source tools to read data into and out of your closed platform? At least that's what I gather from looking at the Github readmes for the two fgreader tools. I'm not sure that this fits with the spirit of the DCP registry standards.

mvonpapen commented 4 years ago

Dear @matthewspeir ,

Yes, you are correct. We provide open-source-tools to read data into and out of our closed platform, and additionally to analyze this data. However, using the provided source code from git, people can actually do this without using our free(!) platform.

As I said earlier, we are very interested in getting listed in the Analysis tools registry and we have already taken several steps to meet your standards. In contrast, we noticed that the Analysis tools registry even lists proprietary software such as the Bioturing Browser, which is not only closed-source but also not free. Could you please explain what exactly you mean by us not fitting the spirit of the DCP registry standards?

Best, Mitch

matthewspeir commented 4 years ago

Hi, @mvonpapen.

Admittedly, I did not write the standards and have only been involved in reviewing a few of the most recent submissions to the Analysis registry. I have been applying the standards as I understand them (and with some guidance from those who wrote them) to these submissions. Since I was not involved in adding the Bioturing Browser to the registry, I can't comment on what was involved there or how the decision was made.

I believe Jean Chang (@jlchang) and Tim Tickle (@TimothyTickle) were both involved in the creation of these standards. Perhaps they could comment on how your portal meets the DCP standards.

mvonpapen commented 4 years ago

Hi @matthewspeir ,

Thank you for the reference to @TimothyTickle and @jlchang. We are eager to meet your standards and some advice on how to do that would help a lot at the moment.

Best, Mitch

mvonpapen commented 4 years ago

Dear @theathorn ,

We are still willing to provide all necessary information to be listed in the analysis portal registry. As I outlined above, all relevant information is publicly available in our github. In addition to the analysis code we also provide test-data as well as a test environment that is easy to set up.

I noticed that there are two commercial providers (BioTuring and DNAStack) in the analysis registry at the moment that offer services similar to ours. Therefore, I hope that there is a way that FASTGenomics can also be included in the registry.

I would be very happy to hear from you.

Best regards, Mitch

theathorn commented 4 years ago

Hello @mvonpapen,

The DCP is currently in maintenance mode and not accepting new portal registrations, but we expect to resume the registration process in the coming months. I will update this ticket when that happens.

Trevor

NoopDog commented 3 years ago

Hi @mvonpapen The HCA data portal has once again begun accepting submissions to the Methods Registry. Can you take a look at this submission and make any required updates? We will then restart the approval process.

mvonpapen commented 3 years ago

Hi @NoopDog , I'm happy to hear that you re-opened for submissions. I've updated our description and look forward to proceeding with the process.

Required:

Tool title: FASTGenomics
Contact name: Team FASTGenomics
Contact email: contact@fastgenomics.org
Who to attribute: Comma Soft AG
Portal URL: https://beta.fastgenomics.org
Short description: FASTGenomics is a cloud-based collaboration platform providing data management and reproducible analyses for scRNA-seq and omics data. FASTGenomics allows you to collaborate in groups, share data, create individual analyses and publish interactive results.

Optional: Long description

FASTGenomics - a cloud-based collaboration platform for data management and reproducible analyses of scRNA-seq and omics data

Collaboration and data sharing is key in biomedical research. It involves experts from several fields of study such as Molecular Biology, Immunology, Data Science and Computer Science as well as storage and re-use of data in a reproducible environment. Our Life and Data Science experts at Comma Soft have therefore developed the open platform FASTGenomics, which provides a common infrastructure, smart data management, is easy to use and allows direct access to data and results. It thus acts as a single point of truth and brings together all collaborators of your project.

The aim of FASTGenomics is to provide the highest reproducibility and transparency for single cell and omics data analysis to the whole community. The platform offers publicly available datasets, reproducible analyses and interactive projects for exploration and visualization of gene expression data. Docker containers provide full reproducibility and help avoid "works only on my machine" problems.

FASTGenomics is an open-access platform and is used as the central data and analytics platform in various European research projects such as the Human Cell Atlas project discovAIR and the EU H2020 project SYSCID.

We are an experienced partner with a tight network of leading experts from Bioinformatics, Immunology and Pharma. Also, we are an active member of several academic networks such as the Human Cell Atlas Lung Biological Network, Sparse2Big, and Single Cell Omics Germany. Together, we can help you get started with your research project, assist in data management, and leverage the power of state-of-the-art AI-based techniques. Our hybrid design also allows custom solutions such as FASTGenomics on-premises for clinical and pharmaceutical research facilities.

Where to find us:

Twitter: @FASTGenomics
Youtube: FASTGenomics channel
Slack: Slack support channel
Github: https://github.com/FASTGenomics
Docker: https://hub.docker.com/u/fastgenomics

Logo or screenshot:

FASTGenomics_oneplace

mvonpapen commented 3 years ago

Hi @NoopDog , Do you have any updates regarding the submission process?

NoopDog commented 3 years ago

Hi, @mvonpapen we hope to have the latest round of applicants processed this week. Will be back to you in the next couple of days with an update. Thank you for your patience! Cheers, D

NoopDog commented 3 years ago

Hi, @mvonpapen I have reached out internally to get clarification on the policy for listing closed source commercial portals. I will update you when we hear back. It may take a week or so for my question to be processed. Thank you for your application to the registry and your patience! Cheers Dave Rogers

NoopDog commented 3 years ago

Hi, @mvonpapen I am happy to report that FASTGenomics has been approved for posting. Your page will be on our staging server shortly. I will let you know when it's up on our staging server so you can review it.

Cheers, Dave

NoopDog commented 3 years ago

Hi @mvonpapen your portal page is up here:

https://dev.singlecell.gi.ucsc.edu/analyze/portals/fastgenomics

Can you review and let us know if you would like any updates before we push this live?

Cheers, Dave Rogers

mvonpapen commented 3 years ago

Hi @NoopDog ,

That is great news! Thank you.

I do have some updates/issues:

the title figure has changed a little. please use the following title figure:
The Team name changed from currently "CC LifeScience @ Comma Soft AG" to now simply "Comma Soft AG"
please change the URL for the "View" button to "https://fastgenomics.org/login"
There are currently two titles, one above and one below the first figure. The title above the first figure is outdated (from the first submission). The new title should be: "FASTGenomics - a cloud-based collaboration platform for data management and reproducible analyses of scRNA-seq and omics data"
Please add a link to the platform under Where to find us: "Homepage: https://fastgenomics.org/login"
Contact: please change "Michael von Papen (michael.papen@comma-soft.com)" to "Team FASTGenomics (contact@fastgenomics.org)"
the FASTGenomics figure at the very bottom (the one that just spells FASTGENOMICS) can be removed

Thanks again, we are thrilled to be part of the HCA registry!

NoopDog commented 3 years ago

@mvonpapen we have made the requested changes and your page is now live. https://data.humancellatlas.org/analyze/portals/fastgenomics

If you would like any other updates please feel free to submit a PR. You can get started with a PR by selecting the "Improve this Page" link at the bottom of the page and editing the markdown source directly in GitHub.

Cheers, D

DataBiosphere / data-portal

FASTGenomics #540
