AlexsLemonade / alsf-scpca

Management and analysis tools for ALSF Single-cell Pediatric Cancer Atlas data.
BSD 3-Clause "New" or "Revised" License

Update alevin/kallisto workflows to use new spliced/spliced_intron references #81

Closed allyhawkins closed 3 years ago

allyhawkins commented 3 years ago

While getting ready to run some of the snRNA-seq samples through Alevin and Kallisto, I noticed that we need to update both workflows to use the new index files we generated, spliced_txome_k31, as well as the updated tx2gene.tsv list that corresponds to these indices.

For now, I'm thinking we can make the spliced_txome_k31 index the default and specify the spliced_intron_txome_k31 or Ensembl-generated index at the command line, but we will probably want to change this if we end up using one of these workflows permanently.

Also, running some of the snRNA-seq samples with the spliced_intron_txome_k31 index requires more memory, so we will need to update the Nextflow config files accordingly.

jashapiro commented 3 years ago

If we have a sense of the memory requirements for the different indices, I think we can adjust the Nextflow resource request on the fly based on an input variable.
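A minimal sketch of what that could look like in a Nextflow config file. The parameter name `index_name`, the process name `kallisto_bus`, and both memory values are placeholders for illustration, not settings taken from this repository; the real values would come from measuring each index.

```groovy
// nextflow.config (sketch only; all names and numbers below are assumptions)

// Default index; can be overridden on the command line with --index_name
params.index_name = 'spliced_txome_k31'

process {
  // 'kallisto_bus' is a hypothetical process name
  withName: 'kallisto_bus' {
    // The closure is evaluated when the task is set up, so it sees the
    // final value of params.index_name, including a command-line override.
    // 64.GB and 16.GB are placeholder values, not measured requirements.
    memory = { params.index_name == 'spliced_intron_txome_k31' ? 64.GB : 16.GB }
  }
}
```

With something like this in place, adding `--index_name spliced_intron_txome_k31` to the `nextflow run` command would select the larger memory request, and omitting it would keep the default.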

allyhawkins commented 3 years ago

Good to know about being able to adjust the memory requirements.

Kallisto is having trouble locating the pre-mRNA index, even when the path is hard-coded as the default index. After I increased the memory, it ran for 10 minutes and then reported that the file was missing. I'm worried there may be a problem with the index itself: the index file is 45 GB, whereas the spliced.txome index is only ~3 GB. That doesn't seem right, so I will go back and take a look.