sr320 / ceabigr

Workshop on genomic data integration with an emphasis on epigenetic data (FHL 2022)

Missing samples from the gene_fpkm.csv file #44

Closed laurahspencer closed 2 years ago

laurahspencer commented 2 years ago

Looks like we're missing data for 4 samples (only have 22 out of the 26 RNASeq .bam files).

sr320 commented 2 years ago

26 bam files are here https://gannet.fish.washington.edu/Atumefaciens/20210726_cvir_stringtie_GCF_002022765.2_isoforms/

sr320 commented 2 years ago

Sounds like https://github.com/epigeneticstoocean/2018_L18-adult-methylation/blob/main/analyses/gene_fpkm.csv is incomplete?

kubu4 commented 2 years ago

Hmmm... Four samples (not sure which ones yet) get dropped during this merge step:

https://github.com/epigeneticstoocean/2018_L18-adult-methylation/blob/fd53c74e8838f0d277043d6180f4f698612e6a45/code/ballgown_analysis.Rmd#L727

Not sure I have the availability to troubleshoot for long, so if someone can also investigate (maybe it's a simple problem with the merge() command I implemented), that'd be killer.
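
For anyone picking this up: a quick way to see which samples a merge drops is to compare the key columns of the two tables before joining them. The sketch below is in Python rather than the R used in the notebook, and the sample IDs and the `unmatched_keys` helper are hypothetical, made up purely for illustration.

```python
# Hypothetical sample IDs; the real ones come from the ballgown notebook.
fpkm_ids = ["S12M", "S9M.", "S13M", "S3F."]    # IDs derived from file names
metadata_ids = ["S12M", "S9M", "S13M", "S3F"]  # IDs listed in the sample sheet


def unmatched_keys(left, right):
    """Return the keys found on only one side of a would-be inner join."""
    left_set, right_set = set(left), set(right)
    return sorted(left_set - right_set), sorted(right_set - left_set)


only_left, only_right = unmatched_keys(fpkm_ids, metadata_ids)
print("Only in FPKM table:", only_left)
print("Only in metadata:  ", only_right)
```

Any ID that shows up in either list would be dropped by an inner join on that key, so printing both lists usually points straight at the mismatched names.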

kubu4 commented 2 years ago

Okay, found the issue. The substring command leaves an errant `.` in four sample names, so those names fail to match during the merge. Will work on a fix...
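
To illustrate the kind of bug described here, this is a minimal Python sketch, not the actual notebook code (which is R). It assumes the substring call took a fixed number of characters from each file name; the thread does not confirm that, and the file names and both helper functions are hypothetical.

```python
def id_by_fixed_width(filename, width=4):
    """Take the first `width` characters (assumed form of the buggy approach)."""
    return filename[:width]


def id_by_delimiter(filename, sep="."):
    """Take everything before the first separator, whatever its length."""
    return filename.split(sep, 1)[0]


# Hypothetical file names: one four-character ID, one three-character ID.
files = ["S12M.gtf", "S9M.gtf"]

fixed = [id_by_fixed_width(f) for f in files]
split = [id_by_delimiter(f) for f in files]

print(fixed)  # ['S12M', 'S9M.']  <- shorter ID picks up the trailing dot
print(split)  # ['S12M', 'S9M']
```

Splitting on the delimiter avoids depending on ID length, which is why it tends to be the safer choice when sample names vary in width.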

kubu4 commented 2 years ago

Fixed with this commit.