Hi! I would like to train a separate model for each cell type in my scATAC-seq dataset, and I am unsure which input peaks and non-peaks to use. I have created a fragment file for each cell type. Should I call peaks and generate non-peaks from each of these files, or should I use a single set of peaks called on the whole dataset? I trained models both ways, and the second option, with the common peak set, gives slightly better results (R = 0.735 vs. R = 0.714 with per-cell-type peaks). Is that the right way to go, or is it more appropriate to call peaks on each cell-type fragment file separately?