The code is readable but would benefit from more detailed docstrings for the functions.
For example:
For the two functions that read in a file, the docstring could state the expected file type (tab-separated, .csv, or .txt).
For the quality_control() function, are the quality controls for the two datasets, as well as the filter thresholds, standard parameters for this type of dataset? Elaborating on why they were set up this way would be helpful.
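As a rough illustration of the level of detail meant here, the sketch below shows what such docstrings might look like. Everything in it is an assumption made for illustration only: the function name read_counts, the "gene" and "count" columns, the min_count parameter, and its value of 10 are hypothetical and not taken from the code under review.

```python
import csv
import io


def read_counts(handle, delimiter="\t"):
    """Read a delimited count table into a list of row dictionaries.

    Hypothetical example: the name, parameters, and file format are
    assumptions, not the reviewed code.

    Parameters
    ----------
    handle : file-like object
        Open text file in tab-separated format (.tsv or .txt) with a
        header row. Pass delimiter="," for a .csv file.
    delimiter : str
        Column separator, a tab by default.

    Returns
    -------
    list of dict
        One dictionary per data row, keyed by the header names.
    """
    return list(csv.DictReader(handle, delimiter=delimiter))


def quality_control(rows, min_count=10):
    """Keep rows whose "count" value is at least min_count.

    Parameters
    ----------
    rows : list of dict
        Rows as returned by read_counts().
    min_count : int
        Placeholder threshold (10 is illustrative only). The real
        docstring should say where the value comes from, for example
        previous work or a specific reference.
    """
    return [row for row in rows if int(row["count"]) >= min_count]


# Usage with an in-memory tab-separated table (made-up values).
example = io.StringIO("gene\tcount\nA\t5\nB\t12\n")
kept = quality_control(read_counts(example))
```

Stating the file format and the origin of each threshold in the docstring lets a reader judge whether the defaults apply to their own dataset without having to read the function body.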
The docstring for the merge function looks great!
Yes, thank you for pointing out these issues. Most of the QC steps and their values are taken from either previous work or relevant literature. You are right in the sense that I need to address this; however, the QC will essentially always be defined by thresholds that depend on the application and the dataset.