gagneurlab / absplice


How to add specific tissues like the retina into the source code #24

Closed Eyulaochen closed 1 year ago

Eyulaochen commented 1 year ago

I noticed that specific tissues like the retina are not included in the existing information. Could you kindly provide guidance on how to incorporate additional tissues, such as the retina, into your source code? Thanks for your help!

WagnerNils commented 1 year ago

Hi, in order to run AbSplice for a certain tissue, you need to provide a SpliceMap of that tissue. We currently provide SpliceMaps for the tissues available in GTEx (retina is not among them). However, creating a SpliceMap for a new tissue is straightforward, as long as you have RNA-seq data. To create SpliceMaps, see this example notebook. As input, you need to provide a count table with split-read counts for each junction and sample (see the Readme of the splicemap package). For the split-read counting, you should use a tool that can detect de novo splice sites. We compared FRASER, RegTools and STAR, which all led to similar results (see Supplementary Fig. 2a,c,d of the AbSplice publication). You do not need a huge cohort in order to create a SpliceMap and get robust estimates for the reference PSI values (see Supplementary Fig. 2e,f of the AbSplice publication). Let me know if you need any further assistance with that.
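As a rough illustration of the count-table input described above, here is a minimal Python sketch that merges per-sample STAR `SJ.out.tab` files into one junction-by-sample table of split-read counts. The nine-column layout of `SJ.out.tab` follows the STAR manual. The output column names (`Chromosome`, `Start`, `End`, `Strand`), the helper names `read_sj` and `build_count_table`, and the choice to count only uniquely mapped reads are assumptions for illustration; check the splicemap Readme for the exact column names and coordinate convention it expects.

```python
import io

import pandas as pd

# Column layout of STAR's SJ.out.tab (tab-separated, no header line).
SJ_COLUMNS = [
    "Chromosome", "Start", "End", "strand_code", "motif",
    "annotated", "unique_reads", "multi_reads", "max_overhang",
]
# STAR encodes strand as 0 (undefined), 1 (+), 2 (-).
STRAND = {0: ".", 1: "+", 2: "-"}
KEY = ["Chromosome", "Start", "End", "Strand"]


def read_sj(handle, sample):
    """Read one SJ.out.tab and keep the junction key plus unique-read counts."""
    df = pd.read_csv(handle, sep="\t", header=None, names=SJ_COLUMNS)
    df["Strand"] = df["strand_code"].map(STRAND)
    return df[KEY + ["unique_reads"]].rename(columns={"unique_reads": sample})


def build_count_table(sj_by_sample):
    """Outer-join all samples on the junction key; absent junctions count as 0."""
    tables = [read_sj(h, s).set_index(KEY) for s, h in sj_by_sample.items()]
    counts = pd.concat(tables, axis=1).fillna(0).astype(int)
    return counts.reset_index()


# Tiny in-memory demo with made-up junctions (hypothetical data).
example = build_count_table({
    "sample_A": io.StringIO(
        "chr1\t100\t200\t1\t1\t1\t5\t0\t30\n"
        "chr1\t300\t400\t2\t1\t0\t2\t1\t20\n"
    ),
    "sample_B": io.StringIO("chr1\t100\t200\t1\t1\t1\t7\t0\t30\n"),
})
print(example)
```

In practice the dictionary values would be paths to each sample's `SJ.out.tab` rather than in-memory strings; `pd.read_csv` accepts both.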

Eyulaochen commented 1 year ago

Thanks for your reply! Let me try it. If I run into other problems, I will post here. Thanks again for your help!

Eyulaochen commented 1 year ago

I generated splicemap_5 and splicemap_3 in a Jupyter notebook. Should I put them into some folder of absplice and then run it?

WagnerNils commented 1 year ago

Yes. What genome version did you use? For hg19 and hg38 it is easy to do. In that case you just save your splicemaps as gzipped files and put them in the folder that is indicated in this config. It is the same folder that already contains the splicemaps that have been downloaded from the example usecase (/data/resources/downloaded_files/splicemap_{genome}/, where genome is either hg19 or hg38). In case you did not run the example, just create this folder and put the splicemaps there (so create a folder called downloaded_files/splicemap_{genome}/ in here). You need to name your splicemaps in the same way as in the config, so in your case e.g. Retina_splicemap_psi3_method=kn_event_filter=median_cutoff.csv.gz, and the same for psi5. Then just add Retina to the list of tissues in the main config. From there, just follow the steps from the Readme to run the workflow on your own VCF files.
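The file placement described above could be scripted roughly as follows. This is only a sketch: the function name `install_splicemap` and the `resources_dir` argument (meant to point at the `data/resources` folder of your checkout) are hypothetical, while the folder layout and file-name pattern are taken from the comment above. It assumes the splicemap was first written as an uncompressed CSV.

```python
import gzip
import shutil
from pathlib import Path

# File-name pattern quoted in the thread; {psi} is "psi3" or "psi5".
NAME_TEMPLATE = "{tissue}_splicemap_{psi}_method=kn_event_filter=median_cutoff.csv.gz"


def install_splicemap(src_csv, tissue, psi, genome, resources_dir):
    """Gzip an uncompressed splicemap CSV into downloaded_files/splicemap_{genome}/."""
    if psi not in ("psi3", "psi5"):
        raise ValueError(f"psi must be 'psi3' or 'psi5', got {psi!r}")
    if genome not in ("hg19", "hg38"):
        raise ValueError(f"genome must be 'hg19' or 'hg38', got {genome!r}")

    target_dir = Path(resources_dir) / "downloaded_files" / f"splicemap_{genome}"
    target_dir.mkdir(parents=True, exist_ok=True)  # create the folder if absent
    target = target_dir / NAME_TEMPLATE.format(tissue=tissue, psi=psi)

    # Stream the CSV through gzip so large files are not loaded into memory.
    with open(src_csv, "rb") as fin, gzip.open(target, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return target
```

For example, `install_splicemap("retina_psi3.csv", "Retina", "psi3", "hg19", "data/resources")` would write `Retina_splicemap_psi3_method=kn_event_filter=median_cutoff.csv.gz` into `data/resources/downloaded_files/splicemap_hg19/`; the input file name here is made up.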

Eyulaochen commented 1 year ago

I added the retina psi3 and psi5 splicemaps as you said, but the workflow cannot process them. The error is:

Waiting at most 5 seconds for missing files.
MissingOutputException in rule download_splicemaps in file /home/cheneyu/absplice/example/workflow/./download/Snakefile, line 97:
Job 5 completed successfully, but some output files are missing. Missing files after 5 seconds. This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait:
../data/resources/downloaded_files/splicemap_hg19/Retina_splicemap_psi3_method=kn_event_filter=median_cutoff.csv.gz
../data/resources/downloaded_files/splicemap_hg19/Retina_splicemap_psi5_method=kn_event_filter=median_cutoff.csv.gz
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2023-07-07T114724.268107.snakemake.log
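The log names the two paths the workflow expected to find. Snakemake normally skips a rule whose output files already exist, so one plausible first check (not necessarily how this issue was resolved) is whether the custom splicemaps sit at exactly those paths with exactly those names. A minimal sketch, with the hypothetical helper `missing_splicemaps`:

```python
from pathlib import Path

# File-name pattern as it appears in the error log above.
NAME_TEMPLATE = "{tissue}_splicemap_{psi}_method=kn_event_filter=median_cutoff.csv.gz"


def missing_splicemaps(splicemap_dir, tissues):
    """Return the expected psi3/psi5 splicemap paths that do not exist yet."""
    splicemap_dir = Path(splicemap_dir)
    expected = [
        splicemap_dir / NAME_TEMPLATE.format(tissue=tissue, psi=psi)
        for tissue in tissues
        for psi in ("psi3", "psi5")
    ]
    return [path for path in expected if not path.is_file()]
```

Called from the workflow directory as `missing_splicemaps("../data/resources/downloaded_files/splicemap_hg19", ["Retina"])`, an empty list would mean both files are present under the names the log reports.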

Eyulaochen commented 1 year ago

How can I skip the download step and get the correct result?

Eyulaochen commented 1 year ago

I resolved those problems, thank you so much!