Closed andrewvanbreda closed 5 months ago
@sacrevert OK thanks for letting me know, I was aware of that danger and deliberately tried to avoid something like that. Maybe it was working on the other columns too when I didn't realise it. We need another run anyway as we also have the preferred species issue with the NVC code.
@sacrevert Good spot. Interestingly, though, that issue doesn't look like it is caused by an NA replacement, as the data file looks fine. I will just have to check the code to see if there is any NA replacement handling in there that I didn't realise.
Just let me know if you spot anything else. Keep in mind that I have tested on my machine also, so you may wish to take that into account when deciding how much to test.
I will let you know when I am in a position to instruct Biren for a rerun of any bits that need it
@andrewvanbreda I have done a bit more work on the community level attributes, and can confirm that they are loaded, but have not worked out how to link the values to actual community names just yet. It might be easier if you can provide some examples; I've quite a bit of other work on at the mo, and although I'm sure I will work it out, some example tests would save some time. Cheers
@sacrevert OK no problem, I will get that to you. Prob not today, but probably by tomorrow
@andrewvanbreda OK, I did actually get my head back around this, and it seems OK. The following gave the answers that I would expect given the raw data.
select *
from indicia.termlists_term_attribute_values ttlav
join indicia.termlists_term_attributes ttla on ttla.id = ttlav.termlists_term_attribute_id and ttla.deleted=false
join indicia.termlists_terms tt on tt.id = ttlav.termlists_term_id and tt.deleted=false
join indicia.terms t on t.id = tt.term_id
limit 100;
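For anyone wanting to try the same join without warehouse access, here is a minimal sketch against a toy in-memory SQLite database. The table names follow the query above; the columns `caption`, `term` and `text_value`, the sample rows and the helper `community_attribute_rows` are assumptions made for illustration, and SQLite uses 0/1 instead of `false` and has no `indicia.` schema prefix.

```python
import sqlite3

# Toy in-memory schema. Table names follow the query above; the columns
# caption, term and text_value are assumptions made for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE terms (id INTEGER PRIMARY KEY, term TEXT);
CREATE TABLE termlists_terms (
    id INTEGER PRIMARY KEY, term_id INTEGER, deleted INTEGER DEFAULT 0);
CREATE TABLE termlists_term_attributes (
    id INTEGER PRIMARY KEY, caption TEXT, deleted INTEGER DEFAULT 0);
CREATE TABLE termlists_term_attribute_values (
    id INTEGER PRIMARY KEY,
    termlists_term_attribute_id INTEGER,
    termlists_term_id INTEGER,
    text_value TEXT);
INSERT INTO terms VALUES (1, 'Example community A'), (2, 'Example community B');
-- The second termlists_terms row is flagged as deleted.
INSERT INTO termlists_terms VALUES (10, 1, 0), (20, 2, 1);
INSERT INTO termlists_term_attributes VALUES (100, 'Example attribute', 0);
INSERT INTO termlists_term_attribute_values VALUES
    (1000, 100, 10, 'x'), (1001, 100, 20, 'y');
""")


def community_attribute_rows(connection):
    """Return (term, attribute caption, value) for non-deleted rows."""
    sql = """
        SELECT t.term, ttla.caption, ttlav.text_value
        FROM termlists_term_attribute_values ttlav
        JOIN termlists_term_attributes ttla
          ON ttla.id = ttlav.termlists_term_attribute_id AND ttla.deleted = 0
        JOIN termlists_terms tt
          ON tt.id = ttlav.termlists_term_id AND tt.deleted = 0
        JOIN terms t ON t.id = tt.term_id
        ORDER BY t.term
    """
    return connection.execute(sql).fetchall()


rows = community_attribute_rows(conn)
```

Selecting the term, caption and value explicitly instead of `*` makes it easier to compare the result with the raw spreadsheet. The value attached to the deleted term is filtered out by the join condition.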
@andrewvanbreda One tiny thing, could you capitalise 'tree' in "tree/shrub height" for consistency (not your fault, this was in the original spreadsheet)? Cheers.
@sacrevert Yes, I will change tree and make a note of it. It might be safer to just change it in the warehouse after importing; I will look at the importer and decide if there is any risk in changing the importer.
Hi @sacrevert,
At the moment the situation is
So I suppose my only query now, before recommending another run on this, would be to work out if we want a source like we have for Plant Att and Pantheon. For Plant Att, the attributes have the following source for comparison (although you don't have to use the same layout format for NVC):
Term: "PLANTATT - attributes of British and Irish plants."
PLANTATT source link: "http://brc.ac.uk/sites/www.brc.ac.uk/files/biblio/PLANTATT_19_Nov_08.zip"
PLANTATT source references: "Hill, M.O., Preston, C.D., & Roy, D.B. (2004). NERC Centre for Ecology & Hydrology: Monks Wood."
@andrewvanbreda OK fine, I will continue working on the TVK issue then.
The source terms should be: Term: "British Plant Communities (NVC floristic tables)"
NVC source link: "http://jncc.defra.gov.uk/page-4265"
NVC source reference: "Rodwell, J.S. (Ed.). (1991). British Plant Communities (5 Volumes). Cambridge University Press, Cambridge, UK."
@sacrevert OK, great, I will add that to the code and then we can look to do another run of this next week
Source added to the taxa_taxon_list_attributes along with associated link/reference. Committed.
Note that source_id doesn't appear on the termlists_term_attributes table, so I will need to amend that table separately sometime. So the source is not on the community attributes at the moment.
We can amend that simply in the db if needed after import, as it is just one id that needs adding.
termlists_term_attributes now has a source_id column. I will probably update this field manually after import though, as the field might not be present when the importer is run, since it is currently only on the development branch code
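The manual fix described above could look roughly like the following, sketched against a toy in-memory SQLite table. Only the table name and the `source_id` column come from this thread; the `caption` column, the sample rows, the source id `42` and the helper `set_source` are hypothetical.

```python
import sqlite3

# Toy stand-in for the warehouse table; only the table name and the
# source_id column come from the thread, the rest is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE termlists_term_attributes (
    id INTEGER PRIMARY KEY, caption TEXT, source_id INTEGER);
INSERT INTO termlists_term_attributes (id, caption) VALUES
    (1, 'Example NVC attribute'), (2, 'Unrelated attribute');
""")


def set_source(connection, attribute_ids, source_id):
    """Set source_id on the given attributes where it is still empty."""
    if not attribute_ids:
        return 0
    placeholders = ",".join("?" for _ in attribute_ids)
    cur = connection.execute(
        "UPDATE termlists_term_attributes SET source_id = ? "
        f"WHERE id IN ({placeholders}) AND source_id IS NULL",
        [source_id, *attribute_ids],
    )
    connection.commit()
    return cur.rowcount


# 42 is a made-up id standing in for the NVC source term.
updated = set_source(conn, [1], 42)
```

Restricting the update to rows where `source_id` is still empty means a second run would not overwrite a source that has already been set.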
@sacrevert OK thanks. Am just about to look at the Plant Att situation this afternoon....need to refresh memory, will let you know my thoughts on NVC too as soon as I can
@sacrevert OK thanks, I think the code for this importer has changed since the last run (unlike PlantAtt, which hadn't changed), and we also have a new importer file. However, a major advantage with this importer is that we haven't done a test warehouse run yet, so we can do a clean run on that. So I suggest roughly the same course of action as PlantAtt, but with a test warehouse run also. I think the tests after that run need to be a bit more detailed than for PlantAtt: the PlantAtt code looks fine, but with NVC we noticed an issue where the Species Constancy Value wasn't imported during the dev warehouse run (although I couldn't find a reason why that happened). My suggested steps are,
Again will let you know how I get on.
Great. Hopefully the 'Species' column in this new file should map exactly onto the species names in the full set of NVC table information, so hopefully the new TVKs will slot right in.
Nested in #46
Again, similarly to the thread I just opened for the Plant Att import, I am opening a thread for the floristic import. I haven't taken a look at it yet to refresh my memory, but there looks to be some committed code for it. I am less sure that this is complete in comparison to the Plant Att importer. I will take a look and let you know if it is ready for a run; however, my guess is there are some outstanding questions for it