kundajelab / chrombpnet

Bias factorized, base-resolution deep learning models of chromatin accessibility (chromBPNet)
https://github.com/kundajelab/chrombpnet/wiki
MIT License
124 stars, 34 forks

Input for training cell type-specific models #194

Closed: marzamKI closed this issue 4 months ago

marzamKI commented 5 months ago

Hi! I would like to train a model for each cell type in my scATAC-seq dataset, and I am unsure which peaks and non-peaks to use as input. I have created a fragment file for each cell type. Should I call peaks and generate non-peaks from each of these files, or should I use a single set of peaks called on the whole dataset? I trained models both ways, and the common peak set gives slightly better results (R = 0.735 vs. R = 0.714). Is that the right approach, or is it more appropriate to call peaks on each cell type's fragment file separately?

panushri25 commented 4 months ago

Hey, for a cell type-specific model, call peaks on each cell type's file separately.
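As a rough illustration of calling peaks per cell type, here is a minimal sketch that only builds one MACS2 `callpeak` command per file and does not run anything. The function name `build_peak_calling_commands`, the file paths, the output layout, and the specific option values (`-f BED`, `-g hs`, `--shift -75 --extsize 150`, `-p 0.01`) are assumptions based on commonly used ATAC-seq settings, not something stated in this thread; check them against the MACS2 documentation and your own data format before use.

```python
from pathlib import Path


def build_peak_calling_commands(fragment_files, out_dir="peaks", genome_size="hs"):
    """Return one MACS2 `callpeak` argument list per cell type.

    fragment_files: dict mapping a cell type label to its per-cell-type file.
    Nothing is executed here; the caller decides how to run the commands.
    All option values below are illustrative assumptions, not chromBPNet defaults.
    """
    commands = {}
    for cell_type, path in sorted(fragment_files.items()):
        commands[cell_type] = [
            "macs2", "callpeak",
            "-t", str(path),                              # treatment file for this cell type only
            "-f", "BED",                                  # assumed input format
            "-n", cell_type,                              # output name prefix
            "-g", genome_size,                            # "hs" = human effective genome size
            "--outdir", str(Path(out_dir) / cell_type),   # one output folder per cell type
            "--nomodel", "--shift", "-75", "--extsize", "150",
            "--keep-dup", "all",
            "--call-summits",
            "-p", "0.01",
        ]
    return commands
```

Each returned list could be passed to `subprocess.run(cmd, check=True)`. The peak set produced for a given cell type would then be the one used to generate that cell type's non-peaks and to train its model, rather than a peak set shared across the whole dataset.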

panushri25 commented 4 months ago

Please reopen this if you have any more questions!