SATAY-LL / LaanLab-SATAY-DataAnalysis

This repository contains code and workflows for analyzing data from SATAY experiments.
Apache License 2.0

Combine information regarding reads, insertions and genomic features from different sources in one variable for easier access. #26

Open Gregory94 opened 3 years ago

Gregory94 commented 3 years ago

Processing and plotting data from SATAY experiments typically requires combining many different files. To make this easier, the Python script 'genomicfeatures_dataframe_with_normalization.py' was developed. It creates a dataframe that holds information about the various features in a chromosome (e.g. genes, telomeres, centromeres, ARS) together with the number of reads and insertions. It also applies the normalization procedure from 'reads_normalization.py'.
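To illustrate the idea of collecting features, insertions and reads in one table, here is a minimal sketch. It is not the code from 'genomicfeatures_dataframe_with_normalization.py': the toy data, the function name `combine`, the column names and the per-base-pair normalization are all assumptions made for this example, and the real script reads its input from files.

```python
from bisect import bisect_left, bisect_right

# Hypothetical toy data for one chromosome (not taken from the repository).
# Features: (name, type, start, end) with 1-based inclusive coordinates.
features = [
    ("TEL01L", "Telomere", 1, 800),
    ("GENE_A", "Gene", 1000, 2500),
    ("ARS_X", "ARS", 3000, 3200),
]

# Transposon insertions: (position, reads), sorted by position.
insertions = [(500, 3), (1200, 10), (1800, 4), (2400, 1), (3100, 7), (4000, 2)]


def combine(features, insertions):
    """Return one row per feature with its insertion and read counts."""
    positions = [pos for pos, _ in insertions]
    rows = []
    for name, ftype, start, end in features:
        # Slice bounds of the insertions that fall inside [start, end].
        lo = bisect_left(positions, start)
        hi = bisect_right(positions, end)
        reads = sum(r for _, r in insertions[lo:hi])
        rows.append({
            "feature_name": name,
            "feature_type": ftype,
            "start": start,
            "end": end,
            "n_insertions": hi - lo,
            "n_reads": reads,
            # Simple length normalization; an assumption for this sketch,
            # not the procedure implemented in reads_normalization.py.
            "reads_per_bp": reads / (end - start + 1),
        })
    return rows


table = combine(features, insertions)
for row in table:
    print(row)
```

If pandas is available, `pandas.DataFrame(table)` turns this list of dictionaries into a dataframe with one row per feature.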

Gregory94 commented 3 years ago

The dataframe created by genomicfeatures_dataframe_with_normalization.py looks as follows:

[Image: dna_df2]

It currently includes the following information:

To use this function in any Python script, make sure that the input matches the help description at the beginning of the function and that all required files are present (the locations of the files are noted in the function). The output is the variable dna_df2, which contains the dataframe.