OSC / phylogatr-web

The web app for the Phylogatr Project - https://phylogatr.org/
MIT License

Majority of BOLD records are being filtered #53

Open johrstrom opened 2 years ago

johrstrom commented 2 years ago

While working on #52, I got this report. About 82% of BOLD records (479287 of 585125) are being marked as invalid.

- input_records: 585125
  invalid_records: 479287
  invalid_taxons: 494
  invalid_occurences: 0
  output_files: 105344
  name: add_bold_records
  time: '2022-02-23T16:59:10-05:00'

They're getting filtered in lib/tasks/pipeline.rake, lines 317-321: https://github.com/OSC/phylogatr-web/blob/2dcfb1d79e6efb3d92697c08a863887735632448/lib/tasks/pipeline.rake#L317-L321
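
As a sanity check on the 82% figure, here is a minimal Ruby sketch that parses the report above and recomputes the numbers. The helper names `invalid_percentage` and `remaining_records` are hypothetical and not part of phylogatr-web; the assumption that `output_files` equals the input count minus all invalid counts is only inferred from the numbers in this one report, not confirmed from the pipeline code.

```ruby
require 'yaml'

# Report copied verbatim from this issue; keys are spelled as they appear there.
report_yaml = <<~YAML
  - input_records: 585125
    invalid_records: 479287
    invalid_taxons: 494
    invalid_occurences: 0
    output_files: 105344
    name: add_bold_records
    time: '2022-02-23T16:59:10-05:00'
YAML

report = YAML.safe_load(report_yaml).first

# Hypothetical helper: share of input records marked invalid, in percent.
def invalid_percentage(report)
  (100.0 * report['invalid_records'] / report['input_records']).round(1)
end

# Hypothetical helper: records left after subtracting every invalid count.
# Assumes the three invalid_* counters do not overlap.
def remaining_records(report)
  report['input_records'] -
    report['invalid_records'] -
    report['invalid_taxons'] -
    report['invalid_occurences']
end

puts "invalid: #{invalid_percentage(report)}%"
puts "remaining: #{remaining_records(report)} (output_files: #{report['output_files']})"
```

For this report the remaining count matches `output_files`, which suggests the invalid-record filter alone accounts for nearly all of the dropped input.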