genome / scrna_mutations

Supplementary data for Petti, et al 2019 scRNA mutation publication

Thank you! #1

Closed rstatistics closed 5 years ago

rstatistics commented 5 years ago

Thank you!

-Waite Chueng

chrisamiller commented 5 years ago

Tagging @ifiddes-10x who may be able to provide more information

ifiddes-10x-zz commented 5 years ago

Hi,

To run the gex-depth-position pipeline, you will need to have Martian installed. You will also need my annotation pipeline, CAT, because this script uses some of its library functions. Only the Python parts of CAT are required (pip install git+https://github.com/ComparativeGenomicsToolkit/Comparative-Annotation-Toolkit).

Once you have Martian installed, you need to set up your invocation .mro file. I have provided an example file. Set the transcripts value to the path of a genePred-format annotation file. You can generate this file by running the Kent tool gff3ToGenePred on a standard GENCODE GFF3. Point possorted_bam and cell_barcodes to the output of the Cell Ranger run you want to analyze. valid_chroms is a chromosome whitelist: any transcript in the genePred file that is not on one of these chromosomes will be discarded. Finally, the kit_type field accepts either 3' or 5' and simply transforms the coordinates so that the resulting data file is in transcript orientation.
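To illustrate what the valid_chroms whitelist does, here is a minimal Python sketch that drops genePred records on non-whitelisted chromosomes. It is not the pipeline's own code. It only assumes the standard genePred layout, where the chromosome is the second tab-separated column, and the function name filter_genepred_by_chrom is hypothetical.

```python
def filter_genepred_by_chrom(lines, valid_chroms):
    """Keep genePred records whose chromosome is in the whitelist.

    Assumes tab-separated genePred lines in which column 1 is the
    transcript name and column 2 is the chromosome.
    """
    allowed = set(valid_chroms)
    kept = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) > 1 and fields[1] in allowed:
            kept.append(line)
    return kept


if __name__ == "__main__":
    # Two made-up records: one on chr1, one on an unplaced scaffold.
    records = [
        "txA\tchr1\t+\t100\t500\t150\t450\t1\t100,\t500,",
        "txB\tchrUn_scaffold\t-\t10\t90\t10\t90\t1\t10,\t90,",
    ]
    for rec in filter_genepred_by_chrom(records, ["chr1", "chr2"]):
        print(rec.split("\t")[0])  # prints "txA"
```

Pre-filtering the annotation this way is optional, since the pipeline applies the whitelist itself. It may help if you want to check how many transcripts would survive before launching a run.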

Once this is set up, you can run the script with export MROPATH=$REPO/mro; mrp example.mro runfolder --jobmode=$JOBMODE where $REPO is the location you cloned this repository to and $JOBMODE is one of the standard batch systems that Cell Ranger supports.

Let me know if you have any other questions. The script is set up this way because it would take quite a long time to run on a single thread across a large number of transcripts. If you are only interested in a small number of transcripts, or have time to spare, I can spend some time rewriting it in a simpler, Martian-free format.

rstatistics commented 5 years ago

@ifiddes-10x @chrisamiller Many thanks for your helpful reply. I have now run it successfully. Part of the results is as follows:

##################################################
$ head results.csv
,tx_id,tx_position,read_molecule_fraction
0,ENST00000457698.1,1644,0.05263157894736842
1,ENST00000457698.1,1645,0.05263157894736842
2,ENST00000457698.1,1646,0.05263157894736842
3,ENST00000457698.1,1647,0.05263157894736842
4,ENST00000457698.1,1648,0.05263157894736842
5,ENST00000457698.1,1649,0.05263157894736842
6,ENST00000457698.1,1650,0.05263157894736842
7,ENST00000457698.1,1651,0.05263157894736842
8,ENST00000457698.1,1652,0.05263157894736842
##################################################

I would like to know whether the result is in the correct format. Thank you in advance.

ifiddes-10x-zz commented 5 years ago

Yes, that looks correct to me. The columns are the transcript ID (tx_id), the position (tx_position, measured from the capture site as defined by the kit type), and read_molecule_fraction (the number of UMIs seen at this position of the transcript divided by the total number of UMIs seen for this transcript).
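To make the read_molecule_fraction definition concrete, here is a minimal Python sketch of the calculation as described above. It is not the pipeline's implementation. The input layout (one (tx_id, tx_position, umi) tuple per position covered by a read carrying that UMI) and the function name are assumptions made for illustration only.

```python
from collections import defaultdict


def read_molecule_fraction(observations):
    """Return {(tx_id, tx_position): fraction} for the given observations.

    observations: iterable of (tx_id, tx_position, umi) tuples
    (a hypothetical input layout, not the pipeline's own).
    """
    umis_at_position = defaultdict(set)
    umis_in_transcript = defaultdict(set)
    for tx_id, position, umi in observations:
        # Sets collapse repeated sightings of the same UMI.
        umis_at_position[(tx_id, position)].add(umi)
        umis_in_transcript[tx_id].add(umi)
    return {
        (tx_id, position): len(umis) / len(umis_in_transcript[tx_id])
        for (tx_id, position), umis in umis_at_position.items()
    }


if __name__ == "__main__":
    # Made-up data: two UMIs on one transcript, only one reaching position 11.
    obs = [
        ("tx1", 10, "AAA"),
        ("tx1", 10, "CCC"),
        ("tx1", 11, "AAA"),
    ]
    fractions = read_molecule_fraction(obs)
    print(fractions[("tx1", 10)])  # prints 1.0
    print(fractions[("tx1", 11)])  # prints 0.5
```

Under this definition, the value of about 0.0526 in the output above would correspond to roughly 1 of 19 UMIs for that transcript covering each listed position.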

-- Ian Fiddes, Ph.D. Computational Biologist 2 10x Genomics

rstatistics commented 5 years ago

Sorry for the late reply. Thank you very much!