We had several conversations about how to handle the fact that the old L2G model used only protein-coding genes and that our GS list also contains only protein-coding genes. The optimal way to deal with this is still unclear, but for now we decided to add a binary feature column indicating whether a gene is protein coding or not. This is supposed to solve our problems, at least potentially, but we have to keep in mind that it will probably come out as the most important feature in the feature importance ranking.
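As a rough illustration of the binary feature described above, here is a minimal sketch in plain Python. The column names (`geneId`, `isProteinCoding`), the helper `add_protein_coding_flag`, the placeholder gene IDs and the biotype lookup are all assumptions made for the example, not the actual L2G feature matrix schema; the only thing taken as given is that a gene's biotype is available and equals `protein_coding` for protein-coding genes.

```python
from typing import Any, Dict, List

PROTEIN_CODING = "protein_coding"  # assumed biotype label for protein-coding genes


def add_protein_coding_flag(
    rows: List[Dict[str, Any]],
    biotype_by_gene: Dict[str, str],
    column: str = "isProteinCoding",  # hypothetical feature name
) -> List[Dict[str, Any]]:
    """Return copies of the feature rows with an extra 0/1 column.

    Genes missing from the biotype lookup are flagged as 0 (an assumption;
    they could also be dropped or treated as unknown).
    """
    flagged = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are not mutated
        biotype = biotype_by_gene.get(row["geneId"])
        new_row[column] = 1 if biotype == PROTEIN_CODING else 0
        flagged.append(new_row)
    return flagged


# Placeholder data, for illustration only.
biotypes = {"GENE_A": "protein_coding", "GENE_B": "lncRNA"}
feature_rows = [
    {"geneId": "GENE_A", "distance": 0.9},
    {"geneId": "GENE_B", "distance": 0.4},
    {"geneId": "GENE_C", "distance": 0.1},  # not in the lookup
]
flagged_rows = add_protein_coding_flag(feature_rows, biotypes)
```

Because the GS positives are all protein coding, a flag like this separates positives from many negatives almost by construction, which is why it is expected to dominate the feature importance ranking.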