AlexsLemonade / scpca-docs

User information about ScPCA processing
https://scpca.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Add FAQ about why we chose Alevin-fry #21

Closed allyhawkins closed 3 years ago

allyhawkins commented 3 years ago

This PR starts the FAQ page and begins to address #10, specifically the question of why we chose Alevin-fry. I included a brief paragraph about why we chose Alevin-fry over Cell Ranger, highlighting that Alevin-fry is more computationally efficient while giving results that are similar to Cell Ranger's. Are these the points we want to highlight, or are there other points or a different direction I should be taking here?

I also thought this would be a good place to add some of the graphs we used during benchmarking, so I made two boxplots that compare UMI/cell and genes/cell between Cell Ranger and just the Alevin-fry configurations we are using, for two single-cell and two single-nuclei samples. For now these are the only plots I included, because I wanted feedback on this direction first; I can then add or remove plots as needed. The other plot I think would be good to add is a scatter plot showing the correlation of mean gene expression between Alevin-fry and Cell Ranger for one or two samples. If others agree, I can add that in as well.
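For reference, the quantities behind those plots could be computed along the following lines. This is a minimal Python sketch, not the script included in this PR: the function names and the toy matrices are hypothetical, and it assumes each tool's output is already loaded as a genes x cells matrix of UMI counts with genes in the same order for both tools. The log transform before the correlation is also an assumption made for illustration.

```python
import numpy as np


def per_cell_metrics(counts):
    """Return (UMI per cell, genes detected per cell) for a genes x cells matrix."""
    counts = np.asarray(counts)
    umi_per_cell = counts.sum(axis=0)           # total UMI count in each cell
    genes_per_cell = (counts > 0).sum(axis=0)   # genes with at least one UMI
    return umi_per_cell, genes_per_cell


def mean_expression_correlation(counts_a, counts_b):
    """Pearson correlation of log1p mean gene expression between two tools.

    Assumes both matrices have the same genes in the same row order;
    the cells (columns) do not need to match.
    """
    mean_a = np.asarray(counts_a).mean(axis=1)
    mean_b = np.asarray(counts_b).mean(axis=1)
    return np.corrcoef(np.log1p(mean_a), np.log1p(mean_b))[0, 1]


# Toy matrices standing in for the two tools' output (made-up values).
alevin_fry_counts = np.array([[3, 0, 1],
                              [0, 2, 0],
                              [5, 1, 4]])
cell_ranger_counts = np.array([[2, 0, 1],
                               [0, 3, 0],
                               [6, 1, 4]])

umi, genes = per_cell_metrics(alevin_fry_counts)
correlation = mean_expression_correlation(alevin_fry_counts, cell_ranger_counts)
```

The per-cell vectors are what a boxplot would summarize for each tool and sample, and the per-gene means are the points a scatter plot would show.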

I also included in this PR the script I wrote to make these plots. I wasn't sure whether there was a good place to keep it (maybe it belongs in a different repo rather than here?), so for now I put it in the scripts directory, and I can move or delete it as people see fit.

I'm leaving this in draft for now because I want initial thoughts on which plots to include or leave out, and I will adjust them accordingly. Tagging @jashapiro and @jaclyn-taroni for any thoughts on this.

Also including a screenshot of the FAQ page once it's built so you have it for reference.

[Screenshot of the built FAQ page, taken 2021-10-11]

allyhawkins commented 3 years ago

Thanks for your feedback @jaclyn-taroni! Based on your comments I went ahead and did a few things:

  1. I removed the script and plots from this PR and filed them in a new PR in alsf-scpca: AlexsLemonade/alsf-scpca#141.
  2. I added temporary links to that branch for now; once that PR is in and we have settled on the plots we like, we can update the links.
  3. I removed mentions of cellranger count, since that names the specific function being used, and now refer to it as Cell Ranger throughout, just as we refer to all of the Alevin-fry functions collectively as Alevin-fry rather than naming the individual functions.
  4. I added a sentence at the beginning stating that the goal is to have data that is easily comparable to data processed by the user.

This should be ready for another look, and please let me know if I missed anything or misinterpreted any of your comments.

allyhawkins commented 3 years ago

I incorporated edits from both @jaclyn-taroni and @jashapiro and added a sentence at the end about different pre-processing methods giving compatible results downstream, citing the two references Josh mentioned. I also updated the figure links to use the updated commit hashes. Let me know if anyone wants another look; otherwise I will go ahead and merge this.