Closed ArnovanHilten closed 11 months ago
Thank you for the encouraging words!
When evaluating, each peak is centered on the position given in its summit column. It is possible for a peak to have a summit offset greater than 1000 bp, or to have multiple summits.
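To illustrate the centering described above, here is a minimal sketch. It assumes the standard narrowPeak convention, in which the 10th column is the summit offset relative to the start in the 2nd column, and the 1000 bp window mentioned in this thread. The function name `summit_window` is hypothetical and is not part of chrombpnet.

```python
def summit_window(line, width=1000):
    """Return (chrom, start, end) of the window centered on the peak summit.

    Assumes a tab-separated narrowPeak line whose 10th column holds the
    summit offset from the peak start (an assumption, not chrombpnet code).
    """
    fields = line.rstrip("\n").split("\t")
    chrom = fields[0]
    peak_start = int(fields[1])
    summit_offset = int(fields[9])  # 10th column: offset from peak start
    summit = peak_start + summit_offset
    half = width // 2
    return chrom, summit - half, summit + half


# A 3000 bp peak whose summit lies 1800 bp from its start.
peak = "chr1\t10000\t13000\t.\t.\t.\t.\t.\t.\t1800"
print(summit_window(peak))  # ('chr1', 11300, 12300)
```

Under these assumptions, a summit offset above 1000 is not a problem in itself: the window follows the summit, not the peak start.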
In your question above: is chrombpnet_nobias the model without the bias model part, or is it the model that provides predictions without bias?
I think both questions point to the same thing: this is the model without the bias part, and hence it provides predictions without bias. I hope that makes sense; if not, let me know and I will try to clarify.
Yes, chrombpnet.h5 is the full (final) model.
Hope this answers your question. Happy to answer any more questions you may have, please reply back here. I will close this as addressed for now.
Thank you for your response, it is clear now. However, I think that it would be good to provide a warning or a message if the input region is larger than 1000 bp. From the description given, it is not super clear that only the region 500 bp around the summit column is evaluated (and that the first columns are not used).
In my case, I had to merge some peaks because the BigWig step threw an error: it does not accept regions that are not ordered (I had overlapping regions). When I merged them with bedtools, I was surprised to see only 1000 bp of predictions for these larger regions.
How do you handle this yourself? Do you only evaluate the 1000 bp for larger regions and will this not give a bias?
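The warning suggested above could be sketched roughly as follows. This is only an illustration under assumptions: the input is taken to be tab-separated BED-like lines with start and end in columns 2 and 3, and the helper names `wide_peaks` and `warn_on_wide_peaks` are hypothetical, not existing chrombpnet functions.

```python
import warnings


def wide_peaks(lines, width=1000):
    """Yield (chrom, start, end) for every region longer than `width` bp."""
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        chrom, start, end = line.split("\t")[:3]
        start, end = int(start), int(end)
        if end - start > width:
            yield chrom, start, end


def warn_on_wide_peaks(lines, width=1000):
    """Warn once per region that exceeds the evaluated window size."""
    found = list(wide_peaks(lines, width))
    for chrom, start, end in found:
        warnings.warn(
            f"{chrom}:{start}-{end} spans {end - start} bp; "
            f"only {width} bp around the summit are evaluated"
        )
    return found


peaks = [
    "chr1\t10000\t13000\t.\t.\t.\t.\t.\t.\t1800",  # 3000 bp, flagged
    "chr1\t20000\t20800\t.\t.\t.\t.\t.\t.\t400",   # 800 bp, not flagged
]
print(warn_on_wide_peaks(peaks))
```

Running a check like this on a merged peak file before prediction would make it visible which regions are only partially covered.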
Hi @panushri25
I really like chrombpnet and it is working great! I have one question regarding the output and the input bed files.
If I understand the code correctly, the network predicts on a region of 1000 bp. These predictions are also saved in the h5 output with pred_bw and contribs_bw. When I was preparing my input files, I was therefore surprised to find a filtered.peaks.bed with a 10th column that had a value higher than 1000.
Is it not necessary for the 10th column (summit) to be smaller than 1000 (I guess around 500)? Are regions whose start and end are more than 1000 bp apart automatically split? I could not find this in the code, but maybe I missed it. Are these peaks currently evaluated as a single region of 500 bp around the summit instead of the whole region?
Thank you for developing such a great tool!
Best,
Arno
PS.
The documentation for the models is a bit confusing to me:
Just to be sure: is chrombpnet_nobias the model without the bias model part, or is it the model that provides predictions without bias? If I understand it correctly, chrombpnet.h5 is the full (final) model.