Closed gonzalezeb closed 6 years ago
@gonzalezeb, can you find the last commit where the table you need is still there?
https://github.com/forestgeo/allodb/commits/master
--
You could do a manual binary search.
Example: Take these 9 points in history: 1, 2, 3, 4, 5, 6, 7, 8, 9
The second commit below
commit 3a8d5b8c711fd602ba807fe69a001eb489b5a7f2
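To make the manual binary search concrete, here is a minimal sketch in Python over the 9 points in history. It assumes the table is present at every point up to some commit and absent at every point after it. The function name `last_point_with_table` and the predicate `has_table` are hypothetical; in practice the check would be you looking at a checked-out commit, and the cutoff at point 6 is made up for illustration.

```python
from typing import Callable, Optional, Sequence


def last_point_with_table(
    history: Sequence[int], has_table: Callable[[int], bool]
) -> Optional[int]:
    """Return the last point (oldest first) where the table still exists.

    Assumes the table exists at every point up to some commit and is
    missing at every point after it. Returns None if it is never found.
    """
    lo, hi = 0, len(history) - 1
    found = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if has_table(history[mid]):
            found = history[mid]  # table still there: search later points
            lo = mid + 1
        else:
            hi = mid - 1  # table already gone: search earlier points
    return found


history = [1, 2, 3, 4, 5, 6, 7, 8, 9]
checked = []


def has_table(point: int) -> bool:
    # Hypothetical check: pretend the table was removed right after point 6.
    checked.append(point)
    return point <= 6


result = last_point_with_table(history, has_table)
print(result, checked)
```

With 9 points, only three of them need to be inspected instead of all nine.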
This is how I rescued references.csv:
git checkout 3a8d5b8c711fd602ba807fe69a001eb489b5a7f2
Manually copy the file into another folder
git checkout master
Manually paste the file into data-raw/
Now we are still left with the problem of linking id to references, right?
Can you please find a commit where there is a master table that contains both the references and equation_allometry? I think I could match on equation_allometry to try to find the id.
This commit f8962bc4ce00c625fe20183c93184a1f5904240c seems to be the last one where the master table and equation_allometry were together.
But the real problem is that I never assigned a ref_id. I think my idea was to use something like the first 4 letters of the last name, then the year, e.g. Clar_1985 (I prefer this to numbers). Maybe I have to do that by hand?
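This may not need to be done by hand. A rough sketch of the "first 4 letters of the last name, then the year" idea in Python follows. The function name `author_year_id` and the sample authors are hypothetical (only Clar_1985 comes from this thread), and the underscore separator is taken from that example. The sketch also lists ids that occur more than once, since two different references can end up with the same id.

```python
from collections import Counter


def author_year_id(last_name: str, year: int) -> str:
    """Build an id from the first 4 letters of the last name and the year."""
    return f"{last_name.strip()[:4]}_{year}"


# Hypothetical references, only for illustration.
refs = [("Clark", 1985), ("Clarke", 1985), ("Smith", 2001)]

ids = [author_year_id(name, year) for name, year in refs]

# Ids shared by more than one reference need manual attention.
duplicates = [ref_id for ref_id, n in Counter(ids).items() if n > 1]

print(ids)
print(duplicates)
```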
OK, thanks! I'll have a look tomorrow.
@gonzalezeb, I think I got something useful for you to tweak and finalize the references table.
I added the file data-raw/data_references_id.csv, which contains the new column author_year, following your suggestion. I have not overwritten ref_id because it is not entirely NAs. Please check that problem, and merge the two columns ref_id and author_year into one.
What will happen if two equations have the same combination of author and year? It is kind of working for now, but I'm not 100% sure it's safe in the long run. Also, there are 32 references in data_references.csv but 35 in data_references_id.csv; please see what's going on.
Once you are done you may want to move the clean table to data-raw/csv_database/ and continue editing it. Also it'd be nice to tidy data-raw/ by removing the leftover master and references tables.
To your question: What will happen if two equations have the same combination of author and year? Let's use the system Krista used in a previous publication for the citation or reference ID:
Citation ID in the form [last name of first author][year][first letter of first four words of title, when applicable].
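As a sketch of how this citation ID could be generated, here is a small Python function. The name `citation_id`, the sample author and the sample titles are hypothetical. Lowercasing the title initials and treating any run of letters or digits as a word are my assumptions; the form above does not specify either.

```python
import re
from typing import Optional


def citation_id(last_name: str, year: int, title: Optional[str] = None) -> str:
    """[last name of first author][year][first letter of first four title words]."""
    initials = ""
    if title:
        # Assumption: a "word" is any run of letters or digits.
        words = re.findall(r"\w+", title)[:4]
        # Assumption: initials are lowercased.
        initials = "".join(word[0].lower() for word in words)
    return f"{last_name.strip()}{year}{initials}"


# Hypothetical reference, only for illustration.
print(citation_id("Clark", 1985, "Biomass equations for tropical trees"))
print(citation_id("Clark", 1985))
```

The title part is what separates two references that share the same first author and year.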
The final reference table should have the following columns:
| ref.id | ref.doi | ref.author | ref.year | ref.title | ref.journal |
|---|---|---|---|---|---|
@gonzalezeb
I updated data-raw/data_references_id.csv to reflect your request. You'll need to check the data, remove ref_id, then rename refid as ref_id.
Let me know your questions.
@gonzalezeb, the problem is that this new id doesn't help in linking the equations table with the references table, which is what we wanted in the first place. That is because the master table we recovered has no information about the reference title. It only has biomass_equation_source, which contains author and year.
I think I could use author_year once, as I was doing before, to first link the two tables, i.e. references and equations. Then create a reference id formatted as you requested and use that format from then on.
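A minimal sketch of that two-step plan in Python, under stated assumptions: the rows, the `equation_id` column and both helper functions are hypothetical, and the equations are assumed to already carry an author_year key derived from biomass_equation_source. The author_year key is used only once, as a bridge, and the new id is what ends up in the equations table.

```python
import re


def author_year(last_name: str, year: int) -> str:
    # Temporary linking key: first 4 letters of the last name, then the year.
    return f"{last_name[:4]}_{year}"


def new_ref_id(last_name: str, year: int, title: str) -> str:
    # Final id: last name, year, initials of the first four title words.
    # Lowercasing the initials is an assumption.
    initials = "".join(w[0].lower() for w in re.findall(r"\w+", title)[:4])
    return f"{last_name}{year}{initials}"


# Hypothetical rows, only for illustration.
references = [
    {"author": "Clark", "year": 1985, "title": "Biomass equations for tropical trees"},
    {"author": "Smith", "year": 2001, "title": "Tree allometry"},
]
equations = [
    {"equation_id": "e1", "author_year": "Clar_1985"},
    {"equation_id": "e2", "author_year": "Smit_2001"},
    {"equation_id": "e3", "author_year": "Jone_1999"},
]

# Step 1: map the temporary key to the new id.
# Note: references sharing an author_year key would silently overwrite
# each other here, so such collisions need to be resolved first.
lookup = {
    author_year(r["author"], r["year"]): new_ref_id(r["author"], r["year"], r["title"])
    for r in references
}

# Step 2: attach the new id to each equation; None marks a missing match.
linked = [{**e, "ref_id": lookup.get(e["author_year"])} for e in equations]
unmatched = [e["equation_id"] for e in linked if e["ref_id"] is None]

print(linked)
print(unmatched)
```

Equations listed in `unmatched` would still need their reference tracked down by hand.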
@gonzalezeb, I believe this closes this issue: https://github.com/forestgeo/allodb/blob/master/inst/issues/57.md
I cannot find the reference table; I only see reference_metadata in csv_database. I thought I had kept that table there, where I would enter the reference data (citation) for each equation.
But I just realized a big mistake. In the equations table we have a 'ref_id' but I never gave an id to each reference (each row) in the reference table (now lost).
I think I need help from @maurolepore