Hi, Rachel!
Congrats on your script! It's very useful code.
I just have a couple of comments...
There seems to be an issue with "glob.os.chdir(loc)": if I run the cell a second time, this error appears:
FileNotFoundError: [WinError 3] The system cannot find the path specified: 'C:/Users/correndo/Desktop/Coding/Rachel/semesterProject-master//Sample_Datasets'
Also, I'm not completely sure you really need the glob module here. In my case, which also involves managing data and creating files, I could write my whole script with just pandas.
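One way to sidestep the re-run error might be to build the file paths from the folder itself instead of changing the working directory. I can't tell from the traceback alone what exactly goes wrong on the second run, so take this as a rough sketch only: "read_folder" is a hypothetical helper name, and I'm assuming that "loc" is the path to the data folder and that the files are CSVs.

```python
from pathlib import Path

import pandas as pd


def read_folder(loc, pattern="*.csv"):
    """Read every file in `loc` matching `pattern` into a dict of DataFrames.

    Hypothetical helper: the name, the CSV assumption, and the default
    pattern are illustrative and not taken from the original script.
    """
    folder = Path(loc).resolve()
    if not folder.is_dir():
        raise FileNotFoundError(f"Data folder not found: {folder}")
    # Full paths are passed to pandas, so the working directory never changes
    # and re-running the cell starts from the same state every time.
    return {path.stem: pd.read_csv(path) for path in sorted(folder.glob(pattern))}
```

Since nothing calls chdir, running the cell twice should behave the same both times. The dictionary keys are the file names without their extension.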
I think it would be better to split the code into more cells; a single cell is too long. If a problem appears, it is hard to find where it comes from.
Personally, I would recommend using at least one cell for nitrate and another for ureide.
I understand that you're using a loop to read the several files in a given folder, so the code is nested, but there is probably a way to divide it a little.
Acronyms. A few acronyms or abbreviations will be hard for potential users to understand. For example, in "NDF" I think DF stands for dataframe, but in other cases you use lowercase "df" for the same thing. Also, you have "B", "C" and "O", and I couldn't work out what they stand for.
Renaming columns. You currently rename the columns after the merge:
ure_final.rename(columns = {'Absorbance_x':'Each', 'Absorbance_y':'Mean'}, inplace=True)
I think you could do it directly in the previous line, for example:
ure_final=ure_df.merge(ure_g.mean().rename("NEW NAMES"),on='Sample_ID')
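To make the idea concrete, here is a small sketch with made-up data. I'm assuming that ure_df has "Sample_ID" and "Absorbance" columns and that ure_g is ure_df grouped by "Sample_ID"; the sample values are invented for illustration. If both sides are named before merging, pandas never adds the "_x" and "_y" suffixes, so no rename is needed afterwards.

```python
import pandas as pd

# Made-up data standing in for ure_df; only the column names follow the script.
ure_df = pd.DataFrame({
    "Sample_ID": ["A", "A", "B", "B"],
    "Absorbance": [0.1, 0.3, 0.4, 0.6],
})

# Per-reading values, already carrying their final column name.
each = ure_df.rename(columns={"Absorbance": "Each"})

# Per-sample mean, named "Mean" before the merge.
mean = (
    ure_df.groupby("Sample_ID")["Absorbance"]
    .mean()
    .rename("Mean")
    .reset_index()
)

# No column names clash any more, so no suffixes are produced.
ure_final = each.merge(mean, on="Sample_ID")
print(ure_final)
```

The result has the columns "Sample_ID", "Each" and "Mean", the same as your version but without the extra rename step.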
Those are my only comments and suggestions. Your script certainly accomplishes its goals!
Best,
ADRIAN