Open mingwhy opened 1 year ago
https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/tutorial_in#download I downloaded Cell Ranger from this page and installed it on the Hyak server.
# after logging in to Hyak, request compute resources
$ srun -p build --time=6:00:00 --mem=200G --pty /bin/bash
$ pwd
# /gscratch/csde-promislow/mingy16/build_fly_ref
$ curl -o cellranger-6.1.1.tar.gz "https://cf.10xgenomics.com/releases/cell-exp/cellranger-6.1.1.tar.gz?Expires=1630567777&Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiaHR0cHM6Ly9jZi4xMHhnZW5vbWljcy5jb20vcmVsZWFzZXMvY2VsbC1leHAvY2VsbHJhbmdlci02LjEuMS50YXIuZ3oiLCJDb25kaXRpb24iOnsiRGF0ZUxlc3NUaGFuIjp7IkFXUzpFcG9jaFRpbWUiOjE2MzA1Njc3Nzd9fX1dfQ__&Signature=Ta4L6k9JMVaQMbsbl07sYeqFlijZXArBRvGT2Q3V2Z4Fg9gT69TpSIvPqUFJ4mybjjXnL-HjIyAXGfjDfG11a8BQs5FOlJOpm3q6VtJpwkztKaNBhcKSLTXJyuhb5ZTaHb1DsQmL8d0u0hPU0Vs6TAxjgMqAcvtArvslFRgk2laN3V7FdLy20HeaxPdhTtnAsTW4WSt4C7r8LHV3mKJytMjFgN2IxPStnEplHCYuXNhzkwm00E61uLZvJ6fch1E4L2DtSWYZjstnfNqH6Ke4a1z4xYpv7rBXXxGmf8jDcWOFe~2UoZIsmlZKJPE1ncRnFRwnwRNVyiOURCjytQztmQ__&Key-Pair-Id=APKAI7S6A5RYOXBWRPDA"
tar -zxvf cellranger-6.1.1.tar.gz
cd cellranger-6.1.1/
# add cellranger to your PATH
vim ~/.bashrc
# export PATH=/gscratch/csde-promislow/mingy16/build_fly_ref/cellranger-6.1.1:$PATH
source ~/.bashrc
which cellranger
#/gscratch/csde-promislow/mingy16/build_fly_ref/cellranger-6.1.1/cellranger
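As an alternative to editing ~/.bashrc by hand in vim, the export line can be appended non-interactively. The sketch below is only one way to do it; `add_to_path_once` is a hypothetical helper name, not something shipped with Cell Ranger, and it assumes a POSIX-style shell.

```shell
# Append an "export PATH=..." line to a shell rc file, but only if the exact
# line is not already present. add_to_path_once is a hypothetical helper.
add_to_path_once() {
    dir=$1
    rcfile=$2
    line="export PATH=${dir}:\$PATH"
    # -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -Fxq "$line" "$rcfile" 2>/dev/null; then
        printf '%s\n' "$line" >> "$rcfile"
    fi
}

# Example, using the install path from the notes above:
# add_to_path_once /gscratch/csde-promislow/mingy16/build_fly_ref/cellranger-6.1.1 ~/.bashrc
# source ~/.bashrc
```

Running it twice leaves a single export line in the file, so it is safe to re-run.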
online tutorial: https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/tutorial_mr
I mainly followed this one: https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/tutorial_mr#macaque_6.0.0
After looking at the README at 'http://ftp.ensembl.org/pub/release-104/fasta/drosophila_melanogaster/dna/', I downloaded 'Drosophila_melanogaster.BDGP6.32.dna_rm.toplevel.fa.gz'.
$ curl -O http://ftp.ensembl.org/pub/release-104/gtf/drosophila_melanogaster/Drosophila_melanogaster.BDGP6.32.104.chr.gtf.gz
$ wget http://ftp.ensembl.org/pub/release-104/fasta/drosophila_melanogaster/dna/Drosophila_melanogaster.BDGP6.32.dna_rm.toplevel.fa.gz
$ gzip -cd Drosophila_melanogaster.BDGP6.32.104.chr.gtf.gz > Drosophila_melanogaster.BDGP6.32.104.chr.gtf
$ cellranger mkgtf \
Drosophila_melanogaster.BDGP6.32.104.chr.gtf \
Drosophila_melanogaster.BDGP6.32.104.chr.filtered.gtf \
--attribute=gene_biotype:protein_coding \
--attribute=gene_biotype:lincRNA \
--attribute=gene_biotype:antisense \
--attribute=gene_biotype:IG_LV_gene \
--attribute=gene_biotype:IG_V_gene \
--attribute=gene_biotype:IG_V_pseudogene \
--attribute=gene_biotype:IG_D_gene \
--attribute=gene_biotype:IG_J_gene \
--attribute=gene_biotype:IG_J_pseudogene \
--attribute=gene_biotype:IG_C_gene \
--attribute=gene_biotype:IG_C_pseudogene \
--attribute=gene_biotype:TR_V_gene \
--attribute=gene_biotype:TR_V_pseudogene \
--attribute=gene_biotype:TR_D_gene \
--attribute=gene_biotype:TR_J_gene \
--attribute=gene_biotype:TR_J_pseudogene \
--attribute=gene_biotype:TR_C_gene
$ gzip -cd Drosophila_melanogaster.BDGP6.32.dna_rm.toplevel.fa.gz >Drosophila_melanogaster.BDGP6.32.dna_rm.toplevel.fa
$ cellranger mkref --genome=BDGP6.32 --fasta=Drosophila_melanogaster.BDGP6.32.dna_rm.toplevel.fa --genes=Drosophila_melanogaster.BDGP6.32.104.chr.filtered.gtf
A folder named 'BDGP6.32' is generated, and it's done~
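The --attribute list passed to cellranger mkgtf follows the 10x tutorial, so some of those gene_biotype values may not occur in the fly annotation at all. A quick tally of the values present in the unfiltered GTF shows which filters actually match. This is a minimal sketch using standard tools only; `count_gene_biotypes` is a hypothetical helper name, and it assumes the GTF carries a `gene_biotype "..."` attribute on its gene lines, as Ensembl GTFs do.

```shell
# Count how many "gene" records carry each gene_biotype value in a GTF file.
# count_gene_biotypes is a hypothetical helper, not a Cell Ranger command.
count_gene_biotypes() {
    gtf=$1
    # Keep only lines whose feature column (column 3) is "gene", extract the
    # gene_biotype value, then tally identical values, largest count first.
    awk -F '\t' '$3 == "gene"' "$gtf" \
        | sed -n 's/.*gene_biotype "\([^"]*\)".*/\1/p' \
        | sort | uniq -c | sort -k1,1nr
}

# Example:
# count_gene_biotypes Drosophila_melanogaster.BDGP6.32.104.chr.gtf
```

Comparing the same tally on the filtered GTF is a simple way to confirm what mkgtf kept.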
10x single cell data preprocessing
Open the Globus web page, click ENDPOINTS, then click 'Create a personal endpoint'. It will ask whether you'd like to download a Globus installation dmg file.
Use the Globus 'Preference' panel to delete any previous Globus setup.
Then reinstall; it will ask you to confirm your email and user name.
After installation, open the web page again. It should show your local endpoint on the file transfer page and on your 'Bookmarks' -> 'Your Connections' page.
use Globus to download data.
You need to install the Globus client and delete the previous 'connection'. Then create a new connection on Globus and sync/transfer the data.
The download contains many files; everything you need is under the path: /Users/mingyang/Downloads/Promislow_Lab/fly_DP_10brains_done/outs/fastq_path/HFWWGDRXY/
We have 4 samples, and each sample was run on two lanes. There are 4 files (index1, index2, R1, R2) for each sample per lane. What you need are the R1 and R2 FASTQ files, e.g. F1_S2_L001_R1_001.fastq.gz and F1_S2_L001_R2_001.fastq.gz.
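With 4 files per sample per lane, it is easy to miss one after a large transfer. The sketch below checks one sample for the full I1/I2/R1/R2 set on lanes L001 and L002. `check_fastq_set` is a hypothetical helper name; the two lane labels are an assumption based on "two lanes" above, and the sample number (S2, S3, ...) is matched with a wildcard because it depends on the sample sheet.

```shell
# Report any missing I1/I2/R1/R2 file for one sample across lanes L001 and L002.
# check_fastq_set is a hypothetical helper; the lane labels are assumed.
check_fastq_set() {
    dir=$1
    sample=$2
    missing=0
    for lane in L001 L002; do
        for read in I1 I2 R1 R2; do
            # Expand the glob into the positional parameters. If nothing
            # matches, the pattern stays unexpanded and the -e test fails.
            set -- "$dir"/"${sample}"_S*_"${lane}"_"${read}"_001.fastq.gz
            if [ ! -e "$1" ]; then
                echo "missing: ${sample} ${lane} ${read}"
                missing=$((missing + 1))
            fi
        done
    done
    # Return success only when nothing was missing.
    [ "$missing" -eq 0 ]
}

# Example, using the download path from the notes above:
# check_fastq_set /Users/mingyang/Downloads/Promislow_Lab/fly_DP_10brains_done/outs/fastq_path/HFWWGDRXY F1
```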
Tip: Your FASTQ files must follow the Illumina naming convention, ex. SampleName_S1_L001_R1_001.fastq.gz.
for example: F1_S2_L001_I1_001.fastq.gz
F1: sample name
S2: sample number, based on the order that samples are listed in the sample sheet
L001: the lane number
R1: the read. In this example, R1 means Read 1. For a paired-end run, there is at least one file with R2 in the file name for Read 2. When generated, index reads are I1 or I2.
001: the last segment is always 001
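The naming convention above can be checked mechanically. The following is a small sketch that splits a file name into its parts with sed; `parse_fastq_name` is a hypothetical helper name, and the pattern is my reading of the convention described here, not an official validator.

```shell
# Split an Illumina-style FASTQ file name into sample name, sample number,
# lane, and read. parse_fastq_name is a hypothetical helper.
parse_fastq_name() {
    name=$(basename "$1")
    # The sample name may itself contain underscores, so anchor on the fixed
    # _S<number>_L<lane>_<read>_001.fastq.gz suffix. Names that do not match
    # produce no output.
    printf '%s\n' "$name" | sed -n -E \
        's/^(.+)_S([0-9]+)_L([0-9]+)_([RI][0-9])_001\.fastq\.gz$/sample=\1 number=S\2 lane=L\3 read=\4/p'
}

parse_fastq_name F1_S2_L001_I1_001.fastq.gz
# prints: sample=F1 number=S2 lane=L001 read=I1
```

An empty result is a hint that a file needs renaming before it is uploaded.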
our purchased kit: Chromium Next GEM Single Cell 3’ GEM, Library & Gel Bead Kit v3.1, 4 rxns PN-1000128 https://support.10xgenomics.com/single-cell-gene-expression/library-prep/doc/user-guide-chromium-single-cell-3-reagent-kits-user-guide-v31-chemistry
our dual index kit: the Dual Index Kit TT Set A (PN-1000215) https://kb.10xgenomics.com/hc/en-us/articles/360036953011-Where-can-I-find-the-Dual-Index-Kit-TT-Set-A-PN-1000215-sample-index-sequences-
sign up for or sign in to the 10x cloud computing platform.
https://www.10xgenomics.com/products/cloud-analysis https://support.10xgenomics.com/cloud-analysis/billing
create project on 10x cloud and upload fastq data.
Once you upload data, 10x Cloud stores it for free for 90 days, then charges \$0.02 per GB per month.
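To get a feel for the storage charge, here is a rough calculation using the rate quoted above (0.02 USD per GB per month after the free 90 days). `storage_cost` is a hypothetical helper name, and the rate may have changed since these notes were written, so check the billing page for current numbers.

```shell
# Estimate 10x Cloud storage charges after the free 90 days, at the
# $0.02 per GB per month rate quoted in the notes (may be out of date).
# storage_cost is a hypothetical helper; billed_months counts only the
# months after the free period ends.
storage_cost() {
    gb=$1
    billed_months=$2
    awk -v gb="$gb" -v m="$billed_months" 'BEGIN { printf "%.2f\n", gb * 0.02 * m }'
}

storage_cost 100 6   # 100 GB kept for 6 billed months
# prints: 12.00
```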
!!No read quality processing has been done at this point!! Following https://support.10xgenomics.com/cloud-analysis/uploading-fastqs#download, I installed 'the 10x Genomics Cloud CLI', then uploaded the FASTQ data.
I installed it in my home folder: /Users/mingyang/txg-macos-v1.1.1/
Enter the token from the website, and the data upload step starts right away.
I also installed the txg tool on the server: /gscratch/csde-promislow/mingy16/txg-macos-v1.1.1/
start analyzing data on 10x Cloud (free data storage period: 90 days)
select library type: we use SI-TT-A5 ~ SI-TT-A8. I emailed the 10x tech people, and in our case the library type is 'ST', standing for 'standard'. https://support.10xgenomics.com/cloud-analysis/supported-products
create your own transcriptome reference
When you start an analysis, you need a transcriptome reference. 10x doesn't provide a fly genome reference, so you need to upload one yourself.
reference upload guidelines: https://cloud.10xgenomics.com/cloud-analysis/custom-references https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/advanced/references
check out the build_fly_ref.txt file.
upload your own transcriptome reference
Upload following https://support.10xgenomics.com/cloud-analysis/custom-references#upload
You need to enter your token, you can find it in your Account Setting (https://cloud.10xgenomics.com/account/security) For me, it's 0a20f71b186d02f6ae0e023c27f4d75b8181e634d232c5c184e52197a6f72b77
The txg tool verifies all needed files in this reference and uploads them, ~2 mins.
create analysis on 10x cloud
click on the reads files and 'create a new analysis'.
If you've already uploaded your own reference, it will show up under 'Transcriptome reference'.
Then just start the analysis. After the analysis is launched, you will see a message like this: