Open lizzieinvancouver opened 1 year ago
I was fixing a small issue with spp names (issue #449) and noticed an entire cleaning file is gone. I used:
git log --full-history -- analyses/cleaning/clean_duplicates.R
to track back to here and I think this was accidentally deleted by @DeirdreLoughnan here, which is commit 85ae4d88863e71335d023e0ba14235dcfd25a5fd ...
Sorry @FaithJones I don't think you should be on this issue, as you were just deleting this file via a merge (see commit 605039e2a1da54647cfe6c1d08bdd67fa89b8ffb).
In commit # d2c70339fd8ebabb8bf076d4166c7c6ffc380de6 ... I added it back in by just copying and pasting from when it was deleted (see here for exactly what I copied).
The previous commit before the deletion appears to be from 2017.
Re-run ... deletes 234 rows, which (according to our August 2019 notes) is 5 fewer (was 239) than before we deleted cleaning/clean_duplicates.R in Jan 2021, but it is the best we can do. See commit # 45ded0f1211b76522ac208b9be0aa6c60f86f287 for updated data.
@DeirdreLoughnan Could you review this issue and double-check my thinking that this was maybe a mistaken deletion you made when working on the clean_duplicates for traitors (check the commit links I have)? I know it was a while ago, but see what you think ....
Could you also do a quick comparison of ospree_clean.csv as we have it now versus the August 2020 version (you can keep the old version by grabbing/renaming what you have now before you git pull my changes) -- see here for the history of commits on this file. Thanks!
I am not sure how best to handle this ... we likely should rerun EVERYTHING with the updated data to be safe.... that would mean redoing everything in this file. To discuss once we hear what @DeirdreLoughnan finds.
@lizzieinvancouver I am so sorry, this is a huge mistake on my part and not something I remember doing on purpose!
I will also see if I can get the original version from off of my old hard drive and compare it to the current files.
@lizzieinvancouver again I am really sorry for all the trouble this mistake has caused!
I was able to get a copy of analyses/cleaning/clean_duplicates.R off of my hard drive. The oldest version I have is from January 17, 2019, long before I ever modified it.
When I source this code in the cleaning/cleanmerge_all.R file I also get 234 rows of data being removed.
I also compared different versions of the ospree_clean.csv file. The version of the file that I had previously, which for me was Oct 9, 2020 (I must not have pulled frequently back then...the version I have before this is May 13 2020), seems to be the exact same as the version that you just pushed.
I even double checked this and used the FileMerge program on my old Mac and the only difference between my old 2020 version of ospree_clean.csv and the new is that the species epithet from Malyshev18 was changed from "pseudolatanus" to "pseudoplatanus".
Is it possible that the note from August 22 2020 is wrong? Let me know how else I can help fix the mess my mistake caused!
> I am so sorry, this is a huge mistake on my part and not something I remember doing on purpose!
@DeirdreLoughnan This is not a huge mistake at all! I can easily see how it may have happened and no matter how it happened, it could have happened to anyone. I am just skimming your other reply now as I have to run to a meeting, but it could be that the Aug 2020 note was wrong ... thanks for all your effort on this and I will look more later.
@DeirdreLoughnan
> When I source this code in the cleaning/cleanmerge_all.R file I also get 234 rows of data being removed.
> Is it possible that the note from August 22 2020 is wrong?
Definitely possible! I started those notes to keep track of things somewhat, but we were not committed to them nor always careful about updating them, and we never put in print commands or such, so lots of room for human error.
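As a minimal sketch of what "print commands" could look like, the wrapper below reports how many rows a cleaning step removes, so the count lands in the console log instead of in hand-kept notes. The function name `log_rows_removed` and the toy data are hypothetical and not part of the OSPREE code.

```r
# Hypothetical wrapper (not in the repo): run a cleaning step and
# report how many rows it dropped.
log_rows_removed <- function(d, step, label) {
  out <- step(d)
  message(label, ": removed ", nrow(d) - nrow(out),
          " rows (", nrow(out), " remain)")
  out
}

# Toy data with one exact duplicate row
toy <- data.frame(id = c(1, 1, 2, 3), value = c("a", "a", "b", "c"))

# Stand-in for a duplicate-cleaning step
drop_exact_duplicates <- function(d) d[!duplicated(d), , drop = FALSE]

cleaned <- log_rows_removed(toy, drop_exact_duplicates, "clean_duplicates")
```

Wrapping each sourced cleaning step this way would make a change such as 239 versus 234 rows visible the next time the merge script runs.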
> I also compared different versions of the ospree_clean.csv file. The version of the file that I had previously, which for me was Oct 9, 2020 (I must not have pulled frequently back then...the version I have before this is May 13 2020), seems to be the exact same as the version that you just pushed.
> I even double checked this and used the FileMerge program on my old Mac and the only difference between my old 2020 version of ospree_clean.csv and the new is that the species epithet from Malyshev18 was changed from "pseudolatanus" to "pseudoplatanus".
Thank you! I just pulled an older ospree_clean and compared it against yesterday's too. I found lots of special character differences (oy, the town of As is a pain) and then the pseudoplatanus (which started all this) but nothing else. So I think we're good! We'll just have to try to update all the data someday. I will leave this open until then.
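For a scripted version of this kind of check, here is a rough sketch that lists every cell that differs between two versions of a table. It assumes both versions have the same rows in the same order and the same columns. The helper name `compare_versions`, the column names, and the toy values are made up for illustration; only the "pseudolatanus" to "pseudoplatanus" change comes from this thread.

```r
# Hypothetical helper: report cells that differ between two data frames.
# Assumes identical dimensions, column names, and row order.
compare_versions <- function(old, new) {
  stopifnot(identical(dim(old), dim(new)),
            identical(names(old), names(new)))
  diffs <- lapply(names(old), function(col) {
    a <- as.character(old[[col]])
    b <- as.character(new[[col]])
    # Two NAs count as equal; NA versus a value counts as a difference
    same <- (is.na(a) & is.na(b)) | (!is.na(a) & !is.na(b) & a == b)
    changed <- which(!same)
    data.frame(row = changed,
               column = rep(col, length(changed)),
               old = a[changed],
               new = b[changed],
               stringsAsFactors = FALSE)
  })
  do.call(rbind, diffs)
}

# Toy stand-ins for the old and new versions of the file
old <- data.frame(datasetID = c("malyshev18", "other"),
                  species = c("pseudolatanus", "alba"),
                  stringsAsFactors = FALSE)
new <- data.frame(datasetID = c("malyshev18", "other"),
                  species = c("pseudoplatanus", "alba"),
                  stringsAsFactors = FALSE)

diffs <- compare_versions(old, new)
print(diffs)
```

With real files, reading both versions with the same `fileEncoding` in `read.csv` may cut down on spurious special-character differences, though that depends on how each file was written.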
@cchambe12 updated the chilling files -- yay! But I am getting an error in bb_cleanmergeall.R ...
here ...
> source("bb_analysis/cleaning/clean_ambientforcingfromdailyclimate.R") # 4 Dec 11959: (still 11404)
Error in if (length(unique(clim.i$year == year.end.i & clim.i$doy == doy.end.i)) == :
the condition has length > 1
In clean_ambientforcingfromdailyclimate.R I am fairly sure it's happening here:
else if (length(unique(clim.i$year==year.end.i & clim.i$doy==doy.end.i))==1 & unique(clim.i$year==year.end.i & clim.i$doy==doy.end.i)==FALSE){
at i=49 ... but I cannot track down why exactly. @cchambe12 if you have any ideas, let me know.
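One possible explanation, offered as a sketch and not as a diagnosis of the actual data: if the rows of clim.i at i=49 include both dates that match the end date and dates that do not, `unique(...)` returns two values, so the right-hand side of `&` has length 2 and so does the whole condition. From R 4.2.0 onward a condition of length greater than one in `if` is an error, while earlier versions only gave a warning, which might be why the error does not appear on every machine. The vector `is.end` below is a made-up stand-in for `clim.i$year==year.end.i & clim.i$doy==doy.end.i`.

```r
# Made-up stand-in for clim.i$year == year.end.i & clim.i$doy == doy.end.i
is.end <- c(FALSE, FALSE, TRUE)

# Original pattern: `&` is vectorised, so the result has one element
# per unique value of is.end (two here: FALSE and TRUE)
cond.old <- length(unique(is.end)) == 1 & unique(is.end) == FALSE
length(cond.old)

# Minimal change: `&&` short-circuits, so the second comparison is only
# evaluated when unique(is.end) has length 1; the result is a single value
cond.fix <- length(unique(is.end)) == 1 && unique(is.end) == FALSE

if (cond.fix) {
  message("no row matches the end date")
} else {
  message("at least one row matches the end date")
}
```

If this is what is happening, swapping `&` for `&&` in that `else if` should keep the intended logic while always handing `if` a single TRUE or FALSE, assuming there are no NA values in the comparison.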
@lizzieinvancouver For some reason I didn't get an error... which feels weird to me! I pushed the new cleaned ospree data to the repo and am doing some checks now but I think it looks okay
Yes! Seems okay!
I assume unintentionally ...