[ ] Functions should do one thing: please separate reading files from filtering
[ ] Please take a look at this file:
ftp://massive.ucsd.edu/RMSV000000267/2019-09-03_mnchoi_df98cda5/quant/180821_Yeast_TKO_m13306_PD22_Int_01_PSMs.txt
It is the same kind of output, produced by a different piece of software. Your app should work with different kinds of output. I think the best solution is to add parameters that name the columns holding sequences, proteins, and q-values (or other measures of PSM quality).
[ ] Please check what happens when there are multiple csv files in a given folder
[ ] Btw these files usually come in .txt format, so the function should also be able to read non-csv files
[ ] The function should be able to read a single file or multiple files, not just two
[ ] More than two proteins can be matched to a peptide, so the separating step should work with an arbitrary number of proteins in a column
[ ] Filtering (line 31) should be optional: we are often interested in both shared and unique peptides. Instead, add a logical column that indicates whether the peptide is shared
[ ] Lines 38-40 are not necessary (we treat inputs from these files separately)
[ ] The function search_protein should probably take the data frame as a parameter (we don't use global variables)
[ ] The second method does not work for me
[ ] Are you sure that the matrix from the first method is a 0-1 matrix? Please check and correct it if needed
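
To make several of the points above concrete, here is a minimal sketch of the structure I have in mind. It is written in Python purely for illustration and is not meant to replace your code; the same shape applies in whatever language the app uses. All function names, the column names (Sequence, Proteins, QValue), the ";" separator, and the example values are assumptions of mine, not taken from the file linked above. Reading, annotating, and filtering are separate functions, column names are passed as parameters, any number of proteins per row is handled, and the matrix is built from a set of pairs so repeated PSMs cannot push an entry above 1.

```python
import csv
import io


def read_psm_table(stream, delimiter="\t"):
    """Read one delimited PSM table (.txt or .csv) into a list of dict rows.

    Reading only: no filtering happens here.
    """
    return list(csv.DictReader(stream, delimiter=delimiter))


def annotate_shared(rows, protein_col, separator=";"):
    """Return copies of rows with a protein list and a boolean is_shared flag."""
    annotated = []
    for row in rows:
        # Works for one, two, or any number of proteins in the column.
        proteins = [p.strip() for p in row[protein_col].split(separator) if p.strip()]
        annotated.append({**row, "proteins": proteins, "is_shared": len(proteins) > 1})
    return annotated


def filter_by_quality(rows, quality_col, threshold):
    """Optional step: keep rows whose quality score is at or below threshold."""
    return [row for row in rows if float(row[quality_col]) <= threshold]


def incidence_matrix(rows, sequence_col):
    """Build a peptide-by-protein matrix whose entries are strictly 0 or 1."""
    peptides = sorted({row[sequence_col] for row in rows})
    proteins = sorted({p for row in rows for p in row["proteins"]})
    # A set of (peptide, protein) pairs ignores how often a pair was observed.
    pairs = {(row[sequence_col], p) for row in rows for p in row["proteins"]}
    matrix = [[1 if (pep, prot) in pairs else 0 for prot in proteins] for pep in peptides]
    return peptides, proteins, matrix


# Hypothetical column names and values, for illustration only.
example = io.StringIO(
    "Sequence\tProteins\tQValue\n"
    "AAK\tP1\t0.001\n"
    "CCR\tP1; P2; P3\t0.002\n"
    "AAK\tP1\t0.2\n"
)
rows = annotate_shared(read_psm_table(example), protein_col="Proteins")
kept = filter_by_quality(rows, quality_col="QValue", threshold=0.01)
peptides, proteins, matrix = incidence_matrix(rows, sequence_col="Sequence")
```

Because the data are passed in as arguments, none of these functions depends on a global variable. Reading several files then reduces to calling read_psm_table once per file and keeping the results separate.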