NatalieDreux / KujLabGen

Things like skyline & MATLAB specific to the Kuj lab, but still useful application

KujLabCoding class with Brianna and Yuting #3

Open NatalieDreux opened 6 months ago

NatalieDreux commented 6 months ago

3/4/24 Meeting Recap

After the first meeting it was decided that we should meet bi-weekly, allowing more time for me and Herman to take on the challenge assigned that week. The main goal of these meetings is to gain a general understanding of the computational methods used in the Kujawinski lab. This includes MATLAB, Skyline, and visualizing data.

My deliverables for next meeting 3/11/2024

Natalie to come with questions on nomenclature, concepts, etc.

NatalieDreux commented 6 months ago

Deliverables for 3/11/24

  1. Can we maybe have a time to go over GitHub, repositories, etc.?

    Something I was thinking about, considering that is how most code is shared these days. I have project boards, but am clueless when it comes to repositories.

  2. I took the online courses on the MATLAB website and have pretty detailed notes on the very basics. Really I'm just looking at how to take that and apply it to more complicated scripts, strings, etc., and I have a much better time asking questions with a bit more context. I have not yet tried to do anything Kuj lab related in MATLAB. So...

    • Basic syntax of incorporating Skyline into MATLAB, similar to the talk the three of you gave in lab meeting (maybe a bit more entry level)
    • What is our GOAL in MATLAB in terms of what we do in the Kuj lab? What's it doing to the skyline data?
  3. Pulling out and visualizing specific data from a large set.

  4. loops: knockout

  5. honestly anything ......

NatalieDreux commented 5 months ago

(1) read a data table (2) run a script while using a saved function (3) make a standard curve plot

NatalieDreux commented 4 months ago

Practice 1 - Sorting a table

% TransitionList_SkyMat_Example.xlsx
% Let's read the xlsx file into a data table called transitionTable
% functions you might need: cd readtable

cd('\Users\natal\Documents\WHOI\MATLAB\CodingGroupMeeting\4.1.24 Practice\Practice1') % quoted function syntax, since the path contains a space

% From MATLAB website: T = readtable("filename")

TransitionTable = readtable('TransitionList_SkyMat_Example.xlsx'); %semicolon used to prevent printing an output

% Let's sort the data table based on column ionMode
% function you might need: sortrows
% From MATLAB:
% B = sortrows(A)
% B = sortrows(A,column)
% B = sortrows(___,direction)
% B = sortrows(___,Name,Value)
% [B,index] = sortrows(___)

SortedTable = sortrows(TransitionTable, 'ionMode');

% Let's remove all of the rows in the negative ion mode
% function you might need: strcmp
% example code to remove the first row of a table named A
% A(1,:)=[]

s1 = 'negative';
s2 = SortedTable.ionMode;
X = strcmp(s1,s2); % true for every row whose ionMode is 'negative'

SortedTable(X,:) = [];

% Finally, we will save the updated transition list into an Excel spreadsheet
% function you might need: writetable
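The notes stop before this last step, so here is a minimal sketch of what it could look like. The stand-in table, the `Name` column, and the output filename are all made up for illustration; in the exercise the table to save would be the filtered SortedTable.

```matlab
% Made-up stand-in for the filtered table, so the sketch runs on its own
SortedTable = table({'alanine'; 'glutamic acid'}, {'positive'; 'positive'}, ...
    'VariableNames', {'Name', 'ionMode'});

% Hypothetical output file; writetable chooses the Excel format from the .xlsx extension
outputFile = fullfile(tempdir, 'TransitionList_positive_demo.xlsx');
writetable(SortedTable, outputFile);

% Read the file back to confirm what was saved
check = readtable(outputFile);
disp(height(check));
```

Writing to a new filename rather than over the original keeps the unfiltered transition list around in case the filter needs to be redone.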

NatalieDreux commented 4 months ago

Practice 2 - Plotting St. Curve

% Now read the spreadsheet SkyMat_3isotopes_test_pos.xlsx

Spreadsheet = readtable('SkyMat_3isotopes_test_pos.csv');

% and use getErrors.m to calculate the r2, slope, intercept
% based on areas of the light isotope vs standard concentrations(ng)

Thoughts: Need to make "alanine_data" a variable to plug into the script below

alanine_data = Spreadsheet;

% here are some scripts that you may need

k = strcmp(alanine_data.SampleType,'Standard')&...
    strcmp(alanine_data.FragmentIonType,'precursor')&...
    strcmp(alanine_data.IsotopeLabelType,'light');

&: This is the logical AND operator. It performs an element-wise logical AND between the three logical arrays returned by the three strcmp comparisons. The result is a single logical array in which each element is true only if the corresponding elements in all three arrays are true.
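A small sketch of how the element-wise AND combines the three comparisons. The four rows are made up for illustration; only the column names come from the exercise.

```matlab
% Made-up stand-in rows; column names match the ones used in this exercise
SampleType       = {'Standard'; 'Standard'; 'Unknown'; 'Standard'};
FragmentIonType  = {'precursor'; 'y'; 'precursor'; 'precursor'};
IsotopeLabelType = {'light'; 'light'; 'light'; 'heavyD5'};
demo = table(SampleType, FragmentIonType, IsotopeLabelType);

% Each strcmp returns one logical value per row
isStandard  = strcmp(demo.SampleType, 'Standard');        % rows 1, 2, 4 are true
isPrecursor = strcmp(demo.FragmentIonType, 'precursor');  % rows 1, 3, 4 are true
isLight     = strcmp(demo.IsotopeLabelType, 'light');     % rows 1, 2, 3 are true

% A row survives only if all three tests are true for that row
k = isStandard & isPrecursor & isLight;

% Logical indexing keeps the rows where k is true (row 1 only here)
filtered = demo(k, :);
disp(filtered);
```

Naming each mask separately makes it easier to check which condition dropped a given row.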

alanine_std_light = alanine_data(k,:);
clear k

Gave me a filtered table with the rows corresponding to light precursors for the standard samples for each molecule. Still need to find a way to pull out a table for each metabolite. Really that should be the 'alanine_data' variable, but I can't figure it out.

Asked ChatGPT: User "Starting from an existing .csv file, how can I filter the table by column to contain only select inputs such as "Molecule Name" in MATLAB?"

alanineRows = strcmp(alanine_std_light.MoleculeName, 'alanine');
alanineTable = alanine_std_light(alanineRows, :);

aminobenzoicacidRows = strcmp(alanine_std_light.MoleculeName, '4-aminobenzoic acid');
aminobenzoicacidTable = alanine_std_light(aminobenzoicacidRows, :);

glutamicacidRows = strcmp(alanine_std_light.MoleculeName, 'glutamic acid');
glutamicacidTable = alanine_std_light(glutamicacidRows, :);

Gives me one table per metabolite of interest, each containing only the rows for that metabolite.
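The same per-metabolite filtering could probably be done with a loop instead of one block of code per molecule. This is only a sketch: the stand-in table values are made up, and collecting the results in a struct called `tables` is my own choice, not something from the exercise.

```matlab
% Made-up stand-in for alanine_std_light
MoleculeName = {'alanine'; 'glutamic acid'; 'alanine'; '4-aminobenzoic acid'};
Area = [10; 20; 30; 40];
alanine_std_light = table(MoleculeName, Area);

% unique returns each molecule name once
molecules = unique(alanine_std_light.MoleculeName);

tables = struct();  % hypothetical container: one field per metabolite
for i = 1:numel(molecules)
    rows = strcmp(alanine_std_light.MoleculeName, molecules{i});
    % Field names cannot contain spaces or start with a digit,
    % so convert each molecule name into a valid identifier first
    fieldName = matlab.lang.makeValidName(molecules{i});
    tables.(fieldName) = alanine_std_light(rows, :);
end

alanineTable = tables.alanine;  % same rows as the hand-written filter
disp(fieldnames(tables));
```

The loop picks up every molecule in the spreadsheet automatically, so adding a new metabolite to the data would not need new code.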

% now you need to define xdata and ydata, which are the inputs needed to
% run getErrors.m

xdata = alanineTable.AnalyteConcentration;
ydata = alanineTable.Area;

% use the addpath function to add the folder containing getErrors.m to your search path,
% so MATLAB can find the function from any working directory

addpath('C:\Users\natal\Documents\WHOI\MATLAB\CodingGroupMeeting\4.1.24 Practice\Practice2');

getErrors(xdata,ydata)

Define the output for this call in order to avoid having the results land in the 'ans' variable, which is overwritten if another .m function is used later.

% Now let's plot the standard curve of alanine
% functions we use in considerSkyline include plot and fitlm

From ChatGPT, by asking: "Using these inputs already, how can I plot the st. curve using MATLAB functions? plot(xdata,ydata) & fitlm(xdata,ydata)"

% Fit linear regression model

lm = fitlm(xdata, ydata);

% Get coefficients of the fitted model

coefficients = lm.Coefficients.Estimate;
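In the fitlm output, coefficients(1) is the intercept and coefficients(2) is the slope. As a cross-check that does not need the Statistics and Machine Learning Toolbox, the slope, intercept, and r2 can also be worked out with polyfit. This is a sketch on made-up points and is not necessarily how getErrors.m computes them.

```matlab
% Made-up points that lie exactly on y = 2x + 1
xDemo = [0; 1; 2; 3; 4];
yDemo = [1; 3; 5; 7; 9];

% Degree-1 polynomial fit; note the order: p(1) = slope, p(2) = intercept
% (the reverse of fitlm, which lists the intercept first)
p = polyfit(xDemo, yDemo, 1);
slope = p(1);
intercept = p(2);

% r2 = 1 - (residual sum of squares) / (total sum of squares)
yFit = polyval(p, xDemo);
ssRes = sum((yDemo - yFit).^2);
ssTot = sum((yDemo - mean(yDemo)).^2);
r2 = 1 - ssRes / ssTot;

fprintf('slope = %.3f, intercept = %.3f, r2 = %.3f\n', slope, intercept, r2);
```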

% Plot scatter plot of data

scatter(xdata, ydata, 'filled'); hold on;

% Plot regression line

x_range = min(xdata):0.1:max(xdata); % Define x range for plotting regression line
y_range = coefficients(1) + coefficients(2) * x_range; % Calculate corresponding y values
plot(x_range, y_range, 'r', 'LineWidth', 2);

% Now change title to 'Alanine pos', change x-axis label to 'ng'
% and y-axis label to 'light area'
% Add labels and title

xlabel('ng');
ylabel('light area');
title('Alanine pos'); % MATLAB is case-sensitive: title, not Title

hold off;

% Export the figure to a pdf file and close the figure window
% Set the filename for the PDF file

pdf_filename = 'Practice2figure.pdf';

% Export the figure to a PDF file with best fit or fill page options

print(gcf, pdf_filename, '-dpdf', '-bestfit'); % Use '-bestfit' for best fit option

% below is likely going to be the take-home assignment

% use getErrors.m to calculate the r2, slope, intercept
% based on alanine light/heavyD5 ratio vs standard concentrations(ng)
% train of thought: make a table similar to the first, ADDING the
% data for the heavyD5

Spreadsheet = readtable('SkyMat_3isotopes_test_pos.csv');

To modify the string comparison to also pull the 'heavyD5' label from the IsotopeLabelType column of your spreadsheet, you can use the logical OR operator (|). Here's how you can adjust the comparison:

X = strcmp(Spreadsheet.SampleType,'Standard')&...
    strcmp(Spreadsheet.FragmentIonType,'precursor')&...
    (strcmp(Spreadsheet.IsotopeLabelType, 'light') | strcmp(Spreadsheet.IsotopeLabelType, 'heavyD5'))&...
    strcmp(Spreadsheet.MoleculeName, 'alanine');

extra = Spreadsheet(X,:); clear X

Could also have created one table using the same k filter for 'light', then a second one by changing the code to say 'heavyD5', and then solved for the ratio by calling on the Area column of each table and using './' on the corresponding values.
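A sketch of that two-table alternative, on made-up numbers. It assumes each standard concentration appears exactly once per label (no replicates), so that after sorting, row i of the light table and row i of the heavy table belong to the same standard. The names extraDemo, lightTable, heavyTable, and ratioDemo are made up for this illustration.

```matlab
% Made-up stand-in for the filtered alanine standards
IsotopeLabelType = {'light'; 'light'; 'light'; 'heavyD5'; 'heavyD5'; 'heavyD5'};
AnalyteConcentration = [1; 10; 100; 1; 10; 100];
Area = [5; 50; 500; 100; 100; 100];
extraDemo = table(IsotopeLabelType, AnalyteConcentration, Area);

% One table per isotope label
lightTable = extraDemo(strcmp(extraDemo.IsotopeLabelType, 'light'), :);
heavyTable = extraDemo(strcmp(extraDemo.IsotopeLabelType, 'heavyD5'), :);

% Sort both by concentration so the rows line up standard by standard
lightTable = sortrows(lightTable, 'AnalyteConcentration');
heavyTable = sortrows(heavyTable, 'AnalyteConcentration');

% Dot notation returns numeric columns, so ./ divides element by element
ratioDemo = lightTable.Area ./ heavyTable.Area;
disp(ratioDemo);
```

Selecting by label instead of by row number avoids depending on where the light and heavy rows happen to sit in the table.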

% Find the ratio of light/heavyD5

Asking ChatGPT how to pull these values from the spreadsheet (actual language stored in chat history):

% Extract the columns for the ratio calculation

lightarea = extra{1:11, 14}; % Assuming column 14 contains the area values and rows 1-11 are the light rows
heavyarea = extra{12:22, 14}; % Assuming rows 12-22 are the heavyD5 rows

% Calculate the ratio

ratio = lightarea ./ heavyarea; % ./ --> uses the corresponding elements in each (11x1) array
ratio = [0.00861993179512163;0.00303260689506943;0.00542939842789403;0.00679379937466825;0.0133417898851875;0.0549046166343983;0.0959596059336590;0.469175147275201;0.867242556362756;3.81002227241090;7.05651902667263];

% Pulling Std. Conc. values

std = extra{1:11, 7};
std = [0; 0.001; 0.01; 0.05; 0.1; 0.5; 1; 5; 10; 50; 100];

Formatting was not working with the getErrors function, so I made arrays using the values given. The likely cause is that indexing a table with parentheses, as in extra(1:11, 14), returns another table rather than a numeric array. Indexing with curly braces returns the numeric values directly.

getErrors(std,ratio)

% doing what was done above for these variables

lm = fitlm(std, ratio);

% Get coefficients of the fitted model

coefficients = lm.Coefficients.Estimate;

% Plot

x_range = min(std):0.1:max(std); % Define x range for plotting regression line
y_range = coefficients(1) + coefficients(2) * x_range; % Calculate corresponding y values
plot(x_range, y_range, 'r', 'LineWidth', 2);
% Label
xlabel('ng');
ylabel('light/heavyD5 ratio');
title('Alanine pos');

hold on;

% Save

pdf_filename = 'Practice2ExtraFig.pdf';

% Export the figure to a PDF file with best fit or fill page options

print(gcf, pdf_filename, '-dpdf', '-bestfit');

Practice2ExtraFig.pdf

Practice2Figure.pdf