Shen-Lab / ncVarPred-1D3D

Multimodal learning of noncoding variant effects using genome sequence and chromatin structure

Clarification Needed on Location of Epigenetic Profile Output Files #1

Closed: JuseTiZ closed this issue 1 year ago

JuseTiZ commented 1 year ago

Hello,

I've successfully executed the code as per the documentation provided in the repository. However, I've encountered an issue with the large number of output files generated, and I'm having difficulty identifying the specific files that contain the epigenetic profile data.

Could you please provide some guidance on where I might find the epigenetic profile? Specifically, is the profile stored in a .npy file format? If it is, could you indicate in which directory I should look for it? For instance, is it within the model_prediction directory or located somewhere else?

I appreciate your assistance in navigating the outputs.

Thank you!
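One quick way to narrow down which output files hold prediction arrays is to list every .npy file under an output directory together with its shape and dtype. The following is a minimal sketch, not part of the repository: the helper name summarize_npy_files is hypothetical, and model_prediction is only the directory guessed at in the question above.

```python
from pathlib import Path

import numpy as np


def summarize_npy_files(root):
    """Return (relative path, shape, dtype) for every .npy file under root.

    Hypothetical helper for illustration; it is not part of ncVarPred-1D3D.
    """
    root = Path(root)
    summary = []
    for path in sorted(root.rglob("*.npy")):
        # mmap_mode="r" memory-maps the file instead of reading it all into RAM,
        # which keeps the scan cheap even when the arrays are large.
        arr = np.load(path, mmap_mode="r")
        summary.append((str(path.relative_to(root)), arr.shape, str(arr.dtype)))
    return summary
```

Calling summarize_npy_files("model_prediction") and printing each tuple gives a one-line overview per file. Arrays whose first dimension equals the number of input variants are the likely candidates for per-variant predictions.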

JuseTiZ commented 1 year ago

Upon reviewing the model testing code in the repository, I've noticed that the output files seem to be named with the suffixes wt_prediction.npy and mt_prediction.npy. However, I'm currently trying to understand if it is possible to extract information about specific variants from these files.

From what I can observe, there appears to be no coordinate information within the files, which leaves me uncertain about how to associate the predictions with specific genetic variants. Could you please provide some insights on how to relate the contents of these .npy files to particular variants? Is there an additional step or a method within the codebase that I might have overlooked for this purpose?

Thank you for your time and help.
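For readers with the same question, here is a minimal sketch of one way to tie prediction rows back to variants. It rests on an assumption this thread does not confirm: that row i of both the wt and mt arrays corresponds to the i-th variant in the input list, in the original order. The function name, the (chrom, pos, ref, alt) tuple layout, and the use of the mt minus wt difference as an effect score are all illustrative choices, not the repository's documented method.

```python
import numpy as np


def pair_predictions_with_variants(variants, wt, mt):
    """Map each variant to its (mt - wt) prediction difference.

    ASSUMPTION: row i of `wt` and `mt` belongs to variants[i]. Verify this
    against the pipeline's own input ordering before relying on the result.
    """
    wt = np.asarray(wt)
    mt = np.asarray(mt)
    if wt.shape != mt.shape:
        raise ValueError(f"wt shape {wt.shape} != mt shape {mt.shape}")
    if wt.shape[0] != len(variants):
        raise ValueError(
            f"{len(variants)} variants but {wt.shape[0]} prediction rows"
        )
    diff = mt - wt  # one row per variant, one column per predicted profile
    return {variant: diff[i] for i, variant in enumerate(variants)}
```

In practice, wt and mt would come from np.load on the two prediction files, and variants from the same variant list that was fed to the model. The row-count check guards against silently pairing predictions with the wrong variants when the two fall out of sync.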

wuweitan94 commented 1 year ago

Thank you for your interest in our work. I hope the following example makes the pipeline clearer.

Please let me know if anything is still unclear or if there is anything else I can help with.

Best wishes.

JuseTiZ commented 1 year ago

The step-by-step breakdown has clarified the process significantly, and I now understand how to proceed with the identification and analysis of the variants using your pipeline.

I appreciate the time you took to elucidate each stage, from extracting the necessary variant information to interpreting the predictions for the epigenetic events. The example and specific instructions on handling potential mismatches are especially helpful.

If I encounter any further uncertainties or require additional assistance as I work through the data, I will reach out.

Thank you once again for your support and prompt response.