dohlee / chromoformer

The official code implementation for Chromoformer in PyTorch. (Lee et al., Nature Communications. 2022)
GNU General Public License v3.0

demo_meta.predicted.csv format details #6

Closed ytang0831 closed 1 year ago

ytang0831 commented 1 year ago

Hi! I noticed that the columns in the output CSV are gene_id,expression,eid,label,chrom,start,end,strand,split,neighbors,scores,prediction.

What does "prediction" mean? Is it log2 expression or prediction accuracy?

Many thanks!

ytang0831 commented 1 year ago

Also, how can I get predicted RPKM values?

dohlee commented 1 year ago

Sorry for the late reply. In the previous version of the model (the one you are using), the prediction column holds the predicted probability that a gene's expression is greater than the median (label 1) rather than not (label 0).
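To make that reading of the column concrete, here is a minimal sketch that turns the probabilities into predicted classes and compares them with the label column. It assumes that label holds the 0/1 ground truth and that 0.5 is a sensible cutoff; neither is confirmed in this thread. The rows are made-up toy values, not real Chromoformer output, and only the columns needed here are included.

```python
import csv
import io

# Toy rows with made-up values (NOT real model output); only the
# columns used below are included.
toy_csv = """gene_id,label,prediction
geneA,1,0.91
geneB,0,0.12
geneC,1,0.40
geneD,0,0.05
"""


def summarize(handle, threshold=0.5):
    """Return (predicted_classes, accuracy) for a predicted-CSV file handle.

    Assumes `prediction` is P(expression > median) and `label` is the
    0/1 ground truth. The 0.5 threshold is an assumption.
    """
    rows = list(csv.DictReader(handle))
    predicted = [int(float(row["prediction"]) > threshold) for row in rows]
    labels = [int(row["label"]) for row in rows]
    accuracy = sum(p == y for p, y in zip(predicted, labels)) / len(rows)
    return predicted, accuracy


predicted, accuracy = summarize(io.StringIO(toy_csv))
print(predicted, accuracy)
```

For a real file, the same function could be called with `open("demo_meta.predicted.csv", newline="")` in place of the `io.StringIO` handle.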

We are currently making a major update to the repository to improve the model's usability. To get predicted RPKM, you can train the ChromoformerRegressor model.

Please let me know if there are any difficulties using our model. Thanks!

ytang0831 commented 1 year ago

@dohlee, Thanks for your update, but I found a problem.

To run Chromoformer regression on the test set, I used run_demo.py, because there is only a train.py script and no test script. However, run_demo.py is written for binary predictions. Should predictions.append(torch.sigmoid(out.cpu()).numpy()[:, 1]) be changed to predictions.append(out.cpu().numpy())?

dohlee commented 1 year ago

Please refer to run_demo_regression.py. It might solve the problem.
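For readers comparing the two cases, the difference discussed above can be sketched as follows. This is an illustration only, not the contents of run_demo_regression.py. The helper name collect_predictions is hypothetical, and the regression output shape of (batch, 1) is an assumption. The classification branch reuses the line quoted earlier in the thread.

```python
import numpy as np
import torch


def collect_predictions(batch_outputs, task):
    """Gather per-batch model outputs into one 1-D NumPy array.

    `collect_predictions` is a hypothetical helper for illustration.
    """
    collected = []
    for out in batch_outputs:
        out = out.detach().cpu()
        if task == "classification":
            # As quoted in the thread: sigmoid, then keep column 1.
            collected.append(torch.sigmoid(out).numpy()[:, 1])
        elif task == "regression":
            # No sigmoid: keep the raw output, flattened to 1-D.
            # Assumes the regressor emits shape (batch, 1) or (batch,).
            collected.append(out.numpy().reshape(-1))
        else:
            raise ValueError(f"unknown task: {task}")
    return np.concatenate(collected)


# Stand-in tensors in place of real model outputs.
clf_batches = [torch.zeros(4, 2), torch.zeros(3, 2)]
reg_batches = [torch.tensor([[1.5], [2.0]]), torch.tensor([[0.25]])]

clf = collect_predictions(clf_batches, "classification")
reg = collect_predictions(reg_batches, "regression")
print(clf.shape, reg)
```

Flattening the regression output keeps one value per gene, so the result can be written to a single prediction column in the same way as the classification probabilities.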

Closing this issue. Please open a new one if you have any further issues!