esturdivant-usgs / science-base-automation

Automating large USGS ScienceBase data releases

Since last commit, process creates an extra level of child pages #66

Closed by esturdivant-usgs 5 years ago

esturdivant-usgs commented 5 years ago

Closed issue #51 may have addressed the same issue.

esturdivant-usgs commented 5 years ago

I likely introduced redundancy when I added the function rename_dirs_from_xmls() and reworked some of the process to make the code more modular.

esturdivant-usgs commented 5 years ago

The print-out reads:

CREATED PAGE: 'DisMOSH, Cost, MOSHShoreline: Distanc...' in 'DisMOSH, Cost, MOSHShoreline: Distance to foraging areas for piping plovers (foraging shoreline, cost mask, and least-cost path distance): Rockaway Peninsula, NY, 2013–2014.'
No 'webLinks' in JSON for 5d0123a5e4b0573a18f7d21b.
UPDATED XML: /Volumes/stor/Projects/DeepDive/5_datarelease_packages/vol1_v4b_4sb/Rockaway Peninsula, NY, 2013–2014/DisMOSH, Cost, MOSHShoreline: Distance to foraging areas for piping plovers (foraging shoreline, cost mask, and least-cost path distance): Rockaway Peninsula, NY, 2013–2014/Rock14_DisMOSH_Cost_MOSHShoreline_meta.xml
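
A CREATED PAGE line like the one above is what a find-or-create lookup reports when it finds no existing match. As a rough illustration only, here is a minimal in-memory sketch of that pattern. It assumes the lookup compares titles by exact string equality, which may not be how this repository's find_or_create_child() works. The Page class, the function body, and the hyphen versus en dash mismatch are all hypothetical stand-ins, not a diagnosis and not ScienceBase or sciencebasepy calls.

```python
from dataclasses import dataclass, field
from itertools import count

_ids = count(1)  # hypothetical ID generator standing in for ScienceBase item IDs


@dataclass
class Page:
    """Hypothetical stand-in for a ScienceBase page (item)."""
    title: str
    id: int = field(default_factory=lambda: next(_ids))
    children: list = field(default_factory=list)


def find_or_create_child(parent, title):
    """Return (page, created). Assumes matching is exact string equality on title."""
    for child in parent.children:
        if child.title == title:
            return child, False  # found an existing page, nothing created
    child = Page(title)  # no match, so a new child page is created
    parent.children.append(child)
    return child, True


root = Page("Data release landing page")

# First run: the page is created with an en dash (U+2013) in the year range.
existing, _ = find_or_create_child(root, "Rockaway Peninsula, NY, 2013\u20132014")

# Same title again: the existing page is found and reused.
same, created_same = find_or_create_child(root, "Rockaway Peninsula, NY, 2013\u20132014")

# Title differs by one character (plain hyphen): the lookup misses and a
# second page is created next to the first.
other, created_other = find_or_create_child(root, "Rockaway Peninsula, NY, 2013-2014")

print(created_same, created_other, len(root.children))
```

Under that assumption, any difference between the name used for the lookup and the title of the page that already exists is enough to produce an extra page, so comparing the two strings character by character is a cheap first check.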

The output above is most likely produced in the update_all_xmls() function. The find_or_create_child() call does not find the appropriate page, so it creates a new one. The page would have been named after the folder, which in the current version is itself named after the XML title. Possible reasons it is not identifying the correct page: