Closed zacdezgeo closed 1 year ago
Fixes #33
Thanks for the PR, left one comment above.
@m-mohr, let me know if you want any other changes and if I can help with deployment. I'll submit pgstac
afterwards.
Thanks, please also remove the Levenshtein import and dependency, I think it's not used anywhere else. Afterwards this is ready for deployment. I'll take care of deployment, thanks.
Ah, I should've caught that! Just fixed @m-mohr. Thanks!
Thanks, you did not yet remove the dependency from the package.json, right?
Still getting used to node development... removed dependency!
Thanks, merged. I'll redeploy in the next few days and let you know once completed.
@zacharyDez I just redeployed STAC Index. You should now be able to submit your missing entries.
Description
This PR modifies the `checkDuplicates` function to clean the URL and title strings by removing whitespace, underscores, and hyphens before calculating the Levenshtein distance. Additionally, the function now only flags exact duplicates (Levenshtein distance = 0).

Steps for Testing:
Preliminary: Insert `pystac` into Database
Update `common.js` to match your database specs.
Insert Sample Data: Insert a sample record (`pystac`) into your `public.ecosystem` table:

Step 1: Confirm Successful Insert of `pgstac`
Navigate to the `server` directory and execute `npm install` followed by `npm run dev`.
Test with CURL: Run the following CURL command to insert the new record `pgstac`:
This should successfully insert the record without flagging it as a duplicate.
Step 2: Validate Cleaning Function
Test with Modified CURL: Run the following CURL command with modified URL and title:
The URL and title are intentionally filled with whitespace, underscores, and hyphens. The function should clean these characters and flag the record as a duplicate, given the existing `pgstac` entry.
Check for Error: If the cleaning function is working correctly, this CURL command should trigger a duplicate error.
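
For reviewers who want to reason about the two test steps without a running server, here is a minimal sketch of the cleaning and exact-match behavior described in the Description. It is not the actual `checkDuplicates` code: the helper names `cleanString` and `isDuplicate`, the `{ url, title }` record shape, the "URL or title" matching rule, and the sample URLs are all assumptions made for illustration.

```javascript
// Hypothetical helper: strip whitespace, underscores, and hyphens,
// which is the cleaning the PR description mentions.
function cleanString(value) {
  return String(value).replace(/[\s_-]+/g, "");
}

// Hypothetical stand-in for the duplicate check. After cleaning, only exact
// matches count, which is equivalent to a Levenshtein distance of 0.
// Assumption: a match on either the URL or the title flags a duplicate.
function isDuplicate(candidate, existing) {
  const url = cleanString(candidate.url);
  const title = cleanString(candidate.title);
  return existing.some(
    (entry) => cleanString(entry.url) === url || cleanString(entry.title) === title
  );
}

// Illustrative records only; the real table has its own columns.
const pystac = { url: "https://github.com/stac-utils/pystac", title: "pystac" };
const pgstac = { url: "https://github.com/stac-utils/pgstac", title: "pgstac" };

// Step 1: pgstac differs from pystac by one character, so it is not an exact duplicate.
console.log(isDuplicate(pgstac, [pystac]));

// Step 2: the same entry padded with spaces, underscores, and hyphens cleans
// to the stored pgstac values, so it should be flagged.
const noisy = { url: " https://github.com/stac_utils/pg-stac ", title: "p g-s_t a c" };
console.log(isDuplicate(noisy, [pystac, pgstac]));
```

Under these assumptions, the first check corresponds to Step 1 (no duplicate error) and the second to Step 2 (duplicate error).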